[algogeeks] Re: Effictive storage of large document-term-matrix?

2006-05-22 Thread Dirk
Hi Gene, thanks for the reply! The majority of operations performed on this matrix will be searching for documents that contain a specific term. Regards, Dirk

[algogeeks] Re: Effictive storage of large document-term-matrix?

2006-05-22 Thread akshay ranjan
Try B-trees. They are quite useful representations of large databases.

On 5/22/06, Dirk [EMAIL PROTECTED] wrote:
> Hi Varun, thanks for your reply! Sparse matrix memory implementation sounds like a fit to me. Will give Google a try and find out more about it!
> Thanks, Dirk
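To make the B-tree suggestion concrete, here is a minimal in-memory sketch (insert and search only) using the classic split-child scheme, where `t` is the minimum degree. This is an illustrative toy, not the on-disk, page-oriented structure a real database engine would use, and all names are invented for the example.

```python
# Minimal B-tree sketch: nodes hold at most 2t-1 keys; full children are
# split on the way down during insertion (CLRS-style). Toy example only.

class BTreeNode:
    def __init__(self, leaf=True):
        self.keys = []
        self.children = []
        self.leaf = leaf

class BTree:
    def __init__(self, t=2):
        self.t = t                    # minimum degree
        self.root = BTreeNode()

    def search(self, key, node=None):
        node = node or self.root
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.leaf:
            return False
        return self.search(key, node.children[i])

    def _split_child(self, parent, i):
        t = self.t
        full = parent.children[i]
        right = BTreeNode(leaf=full.leaf)
        # Median key moves up; upper half of keys/children move to `right`.
        parent.keys.insert(i, full.keys[t - 1])
        right.keys = full.keys[t:]
        full.keys = full.keys[:t - 1]
        if not full.leaf:
            right.children = full.children[t:]
            full.children = full.children[:t]
        parent.children.insert(i + 1, right)

    def insert(self, key):
        root = self.root
        if len(root.keys) == 2 * self.t - 1:   # root is full: grow a new root
            new_root = BTreeNode(leaf=False)
            new_root.children.append(root)
            self.root = new_root
            self._split_child(new_root, 0)
        self._insert_nonfull(self.root, key)

    def _insert_nonfull(self, node, key):
        if node.leaf:
            node.keys.append(key)
            node.keys.sort()
            return
        i = len(node.keys)
        while i > 0 and key < node.keys[i - 1]:
            i -= 1
        if len(node.children[i].keys) == 2 * self.t - 1:
            self._split_child(node, i)
            if key > node.keys[i]:
                i += 1
        self._insert_nonfull(node.children[i], key)

tree = BTree(t=2)
for k in [20, 5, 1, 12, 30, 7, 17]:
    tree.insert(k)
print(tree.search(12), tree.search(99))  # True False
```

Because every node stores many keys, the tree stays shallow, which is exactly why B-trees suit disk-backed indexes: each node visit corresponds to one page read.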

[algogeeks] Re: Effictive storage of large document-term-matrix?

2006-05-22 Thread Gene
Dirk wrote:
> Hi Gene, thanks for the reply! The majority of operations performed on this matrix will be searching for documents that contain a specific term. Regards, Dirk

A natural implementation would be a table of (docname, term) pairs. Index the table on term so you can look up all documents containing a given term efficiently.
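A (docname, term) table indexed on term is in effect an inverted index: a map from each term to the set of documents containing it. A minimal sketch (class and method names here are invented for illustration):

```python
# Inverted-index sketch of the (docname, term) table indexed on term.
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self._postings = defaultdict(set)   # term -> set of docnames

    def add_document(self, docname, terms):
        """Record every (docname, term) pair for this document."""
        for term in terms:
            self._postings[term].add(docname)

    def docs_containing(self, term):
        """The indexed lookup: all documents that contain `term`."""
        return sorted(self._postings.get(term, ()))

index = InvertedIndex()
index.add_document("doc1", ["storage", "matrix", "sparse"])
index.add_document("doc2", ["matrix", "btree"])
print(index.docs_containing("matrix"))  # ['doc1', 'doc2']
```

The lookup Dirk describes (documents containing a specific term) then costs one hash lookup plus the size of the result, independent of the 100,000-document total.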

[algogeeks] Re: Effictive storage of large document-term-matrix?

2006-05-21 Thread Gene
Dirk wrote:
> Hi! I'm looking for an effective way to store a large document-term matrix. The matrix I'm looking at has about 100,000 documents and probably 1,000 terms. Which representation of this matrix would be the most effective to work with? Putting the whole thing into memory at
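It is worth doing the arithmetic on the sizes Dirk quotes. Assuming a boolean "term appears in document" matrix (an assumption; the thread never says whether counts are stored), even the fully dense matrix is small by in-memory standards, and the sparse (doc, term) pair count depends on an assumed average of ~100 distinct terms per document:

```python
# Back-of-envelope sizes for a 100,000-document by 1,000-term boolean matrix.
docs, terms = 100_000, 1_000

dense_cells = docs * terms                    # 100 million cells
dense_bits_mb = dense_cells / 8 / 1_000_000   # one bit per cell
dense_bytes_mb = dense_cells / 1_000_000      # one byte per cell

# Hypothetical sparsity assumption: ~100 distinct terms per document.
avg_terms_per_doc = 100
sparse_pairs = docs * avg_terms_per_doc       # stored (doc, term) pairs

print(f"dense, 1 bit/cell:  {dense_bits_mb} MB")    # 12.5 MB
print(f"dense, 1 byte/cell: {dense_bytes_mb} MB")   # 100.0 MB
print(f"sparse pairs:       {sparse_pairs:,}")      # 10,000,000
```

With only 1,000 terms, a bit-packed dense matrix is just 12.5 MB, so at ~10% density a sparse encoding is not automatically a memory win; its advantage is iterating only the nonzero entries.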

[algogeeks] Re: Effictive storage of large document-term-matrix?

2006-05-21 Thread Varun Soundararajan
The term-document matrix is a textbook candidate for an in-memory sparse-matrix representation. I have had cases where I represented a matrix of 5,000 words by 50,000 documents (a little more than that, actually) quite efficiently using in-memory sparse-matrix techniques.
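One common in-memory sparse representation is CSR (compressed sparse row), which stores only the nonzero entries plus one offset array per row. The thread does not say which technique Varun used; this is a generic sketch with invented names, built from (doc, term, count) triples:

```python
# CSR sketch: indptr[d]..indptr[d+1] delimits document d's nonzero entries
# in the parallel `indices` (term ids) and `data` (counts) arrays.

def build_csr(triples, n_docs):
    """triples: iterable of (doc_id, term_id, count), doc_ids in [0, n_docs)."""
    rows = [[] for _ in range(n_docs)]
    for doc, term, count in triples:
        rows[doc].append((term, count))
    indptr, indices, data = [0], [], []
    for row in rows:
        for term, count in sorted(row):
            indices.append(term)
            data.append(count)
        indptr.append(len(indices))
    return indptr, indices, data

def row_terms(indptr, indices, data, doc):
    """Non-zero (term_id, count) entries for one document."""
    lo, hi = indptr[doc], indptr[doc + 1]
    return list(zip(indices[lo:hi], data[lo:hi]))

triples = [(0, 3, 2), (0, 7, 1), (1, 3, 5), (2, 1, 1)]
indptr, indices, data = build_csr(triples, n_docs=3)
print(row_terms(indptr, indices, data, 1))  # [(3, 5)]
```

Note the axis choice matters for Dirk's workload: CSR makes "all terms in a document" cheap, while the column-major variant (CSC, terms as rows) makes "all documents containing a term" the cheap direction.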