27 Sep 2005 - 22:10 by Michael Daum
So here's a poor man's relatedness feature.
Every BlogEntry now has a "Related" attribute, which is a list of topic names that are considered to be related somehow to this article. So it is a totally manual approach, i.e. it is 100% the author's responsibility to establish relatedness. Nearly. The list of related articles has a tiny bit of automatism in that the relatedness relation is computed to be symmetric and transitive to a given depth. Heh, plainly: every article that states it is related to another establishes the opposite direction as well, which takes just a simple search to find out who is related to me. Actually we compute this to a depth of two by default, that is, every article that is related to an article that is in turn related to the current one is taken to be related to the current one as well.
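
To make the depth-limited relatedness a bit more concrete, here is a minimal sketch in Python (the plugin itself is written in Perl, and this is not its code). The function name `related_entries`, the dictionary layout and the topic names are all made up for illustration. It treats the Related attributes as an undirected graph and walks it breadth-first up to the given depth.

```python
from collections import defaultdict, deque


def related_entries(related, topic, depth=2):
    """Return the topics related to `topic` within `depth` hops.

    `related` maps a topic name to the topic names listed in its
    Related attribute (a hypothetical layout, not the plugin's own).
    """
    # Make the relation symmetric: if A lists B, then B is related to A too.
    graph = defaultdict(set)
    for source, targets in related.items():
        for target in targets:
            graph[source].add(target)
            graph[target].add(source)

    # Breadth-first walk that stops expanding after `depth` hops.
    seen = {topic}
    queue = deque([(topic, 0)])
    while queue:
        current, dist = queue.popleft()
        if dist == depth:
            continue
        for neighbour in graph[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))

    seen.discard(topic)  # an article is not listed as related to itself
    return sorted(seen)


# Hypothetical blog entries and their Related attributes.
entries = {
    "BlogEntry1": ["BlogEntry2"],
    "BlogEntry3": ["BlogEntry2"],
    "BlogEntry4": ["BlogEntry3"],
}
print(related_entries(entries, "BlogEntry1"))  # ['BlogEntry2', 'BlogEntry3']
```

With the default depth of two, BlogEntry4 is not reported for BlogEntry1 because it is three hops away; calling the function with `depth=3` would include it.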

There is a new tag in the TWiki.BlogPlugin, RELATEDENTRIES, that implements all this in Perl. No recursive INCLUDE-SEARCH orgies. Things have piled up to about 500 lines in the BlogPlugin already. IMHO, this is bad news for TWiki's readiness for TWikiApplications.

Anyway, frankly, there are tools to automate finding similar documents using Latent Semantic Indexing, see Wikipedia:Latent_semantic_analysis and a nice hands-on article on perl.com, "Building a Vector Space Search Engine in Perl" by Maciej Ceglowski. That implementation is a very rudimentary in-memory search engine, see Search-VectorSpace. But there's a more advanced search engine by the same author, Search-ContextGraph, which is based on a spreading activation scheme that performs similarly to LSI. Hm, before I can understand what's going on inside this beast I will have to purchase a copy of "Foundations of Statistical Natural Language Processing" by Christopher D. Manning and Hinrich Schütze. I should have put that on my shelf long ago.
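
For a rough idea of what the vector space approach boils down to, here is a small sketch in Python. It is plain term-frequency cosine similarity, not LSI and not the code of Search-VectorSpace or Search-ContextGraph. The function names and the sample documents are invented for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn a text into a sparse term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_similar(docs, name):
    """Rank all other documents by their similarity to `name`."""
    vectors = {n: term_vector(text) for n, text in docs.items()}
    scores = [(cosine(vectors[name], vec), n)
              for n, vec in vectors.items() if n != name]
    return sorted(scores, reverse=True)


# Hypothetical articles, reduced to a few words each.
docs = {
    "A": "perl search engine",
    "B": "perl search plugin",
    "C": "cooking pasta recipes",
}
for score, name in most_similar(docs, "A"):
    print(f"{name}: {score:.2f}")
```

A real engine would at least drop stop words and weight rare terms higher. LSI goes further and reduces the term space, so that documents can match without sharing the exact same words.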


r11 - 05 Oct 2007 - 20:55:12 - Main.MichaelDaum
Copyright © 1999-2012 Michael Daum Consulting. All rights reserved. Impressum.