Re: RAM Directory not Serializable in Lucene 4.4 as well as 5.X

2016-01-08 Thread Sanne Grinovero
You could use Infinispan as an adaptor between the Lucene Directory and a JDBC database connection: http://infinispan.org/docs/8.0.x/user_guide/user_guide.html#_infinispan_as_a_storage_for_lucene_indexes Infinispan is primarily meant as an in-memory high-performance storage but can be used as a "

Re: Is there a way to share IndexReader data sensibly across independent callers?

2016-02-08 Thread Sanne Grinovero
Hi, you should really try to reuse the same opened Directory, like you suggest, without closing it until your application is "done" with it in all its threads (normally on application shutdown). Keeping a Directory open will not lead to open files; that is probably caused by not closing the ins
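
The advice above (open once, share across threads, close only at shutdown) can be sketched without any Lucene dependency. This is a minimal illustration under stated assumptions: `FakeDirectory` and `SharedDirectoryDemo` are hypothetical stand-ins invented for this sketch, not Lucene classes, and the real `Directory` API is not shown.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a resource that is costly to open, such as a Lucene Directory. */
final class FakeDirectory implements Closeable {
    private final AtomicInteger reads = new AtomicInteger();
    private volatile boolean closed;

    /** Simulates a read; fails if somebody closed the shared instance too early. */
    String read(String name) {
        if (closed) {
            throw new IllegalStateException("directory already closed");
        }
        reads.incrementAndGet();
        return "contents of " + name;
    }

    int reads() { return reads.get(); }

    boolean isClosed() { return closed; }

    @Override
    public void close() { closed = true; }
}

final class SharedDirectoryDemo {
    /** Opens one instance, lets every worker thread use it, and closes it once at the end. */
    static FakeDirectory run(int threads) {
        FakeDirectory shared = new FakeDirectory(); // opened exactly once
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            final int id = i;
            // Workers never open or close the resource themselves.
            pool.execute(() -> shared.read("segment_" + id));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            shared.close(); // the "application shutdown" step
        }
        return shared;
    }

    public static void main(String[] args) {
        FakeDirectory d = run(4);
        System.out.println("reads=" + d.reads() + " closed=" + d.isClosed());
    }
}
```

The point of the design is that ownership of the shared instance sits in one place, so no individual caller is tempted to close it while others are still reading.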

Re: Lucene cluster with NFS or synchronization tool such as rsync

2016-07-05 Thread Sanne Grinovero
I had a similar need some ~5 years ago, and contributed this Lucene extension to the Infinispan project: - http://infinispan.org/docs/8.2.x/user_guide/user_guide.html#_infinispan_as_a_storage_for_lucene_indexes It has since matured and is now being actively maintained by several other people using i

Re: Document retrieval, performance, and DocValues

2016-07-05 Thread Sanne Grinovero
Hi Randy, a first quick and easy win would be to rewrite it as: DocumentStoredFieldVisitor visitor = new DocumentStoredFieldVisitor(Collections.singleton("pos_id")); for(int i=0; i [...] wrote: > My Lucene index has about 3 million documents and result sets can be large, > often 1000’s and sometimes a

Re: Highlight the whole sentence instead of the partial matching terms

2010-01-11 Thread Sanne Grinovero
If you're searching for terms "giving" and "and", it will only highlight those terms, not the whole sentence. That's how the highlighter is meant to work: highlight what the user did query. Also there's no built-in concept of sentence. regards, Sanne 2010/1/11 Li Leon : > Just figured out, misse

Re: Using multiple drives and non-CFS format to improve search performance

2010-08-26 Thread Sanne Grinovero
Hi Stefan, you might want to consider org.apache.lucene.store.FileSwitchDirectory before going for the symlinks approach. Sorry I don't know the effect nor recommended file types, I would naively start setting the smallest on SSD, then perform tests, but that's possibly not the best scenario: under
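
As a rough illustration of the idea behind org.apache.lucene.store.FileSwitchDirectory, the sketch below routes index files to one of two locations based on their file extension. It is a dependency-free toy, not Lucene's implementation: `ExtensionRouter`, the two paths, and the choice of the "tii" and "tis" extensions are all assumptions made for the example, and they are not a recommendation of which files belong on the SSD.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical router: files with a "primary" extension go to one location, the rest to another. */
final class ExtensionRouter {
    private final Set<String> primaryExtensions;
    private final String primaryPath;
    private final String secondaryPath;

    private ExtensionRouter(Set<String> primaryExtensions, String primaryPath, String secondaryPath) {
        this.primaryExtensions = primaryExtensions;
        this.primaryPath = primaryPath;
        this.secondaryPath = secondaryPath;
    }

    /** Convenience factory so callers can list the primary extensions inline. */
    static ExtensionRouter of(String primaryPath, String secondaryPath, String... primaryExtensions) {
        return new ExtensionRouter(new HashSet<>(Arrays.asList(primaryExtensions)), primaryPath, secondaryPath);
    }

    /** Returns the text after the last dot, or the empty string when there is no dot. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** Picks the location that should hold the given index file. */
    String locationFor(String fileName) {
        return primaryExtensions.contains(extensionOf(fileName)) ? primaryPath : secondaryPath;
    }

    public static void main(String[] args) {
        // Paths and extensions are placeholders for the example only.
        ExtensionRouter router = ExtensionRouter.of("/mnt/ssd/index", "/mnt/hdd/index", "tii", "tis");
        for (String file : new String[] {"_0.tis", "_0.fdt", "segments_1"}) {
            System.out.println(file + " -> " + router.locationFor(file));
        }
    }
}
```

Compared with symlinks, keeping the routing rule in code makes it easy to change the extension set between test runs and measure the effect.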

Re: How to close the wrapped directory implementation

2010-09-20 Thread Sanne Grinovero
2010/9/18 Pulkit Singhal : > With RAMDirectory we have the option of providing another Directory > implementation such as FSDirectory that can be wrapped and loaded into > memory: > > Directory directory = new RAMDirectory(FSDirectory.open(new > File(fileDirectoryName))); > > But after building the

Re: Performance problems with lazily loaded fields

2011-03-21 Thread Sanne Grinovero
2011/3/21 Brian Hurt : > I'm having a problem with the performance of lazily-loaded fields with > lucene.  The basic structure of the code is that I get a set of documents > back from a query, then iterate through them, reading most fields to collect > fragments.  This is taking an excessively long

Infinispan & JGroups migrating to Apache License

2013-05-28 Thread Sanne Grinovero
Hello all, as some of you already know the Infinispan project includes several integration points with the Apache Lucene project, including a Directory implementation, but so far we had a separate community because of the license incompatibility. I'm very happy to announce now that both Infinispan

Re: In memory index (current status in Lucene)

2013-07-06 Thread Sanne Grinovero
There is a decent implementation for a fully in-memory Directory in the Infinispan project: https://github.com/infinispan/infinispan/tree/master/lucene This is however not taking advantage of off-heap buffers but storing the index in the heap itself; the reason being that Infinispan can in this ca

Re: Problem deploying in Google App Engine

2013-11-18 Thread Sanne Grinovero
For free deployments I use www.openshift.com but only if the expected load is very low or experimental, otherwise you need paid-for hosting. Sanne On 18 November 2013 04:43, Goutham Tholpadi wrote: > Thanks for the heads up, Uwe! > > Which (free) java web app hosting service do people generall

Re: Serializing RAMDirectory in 4.6.0

2014-01-18 Thread Sanne Grinovero
Hi, I suspect you probably want to do something different. What is your goal? Consider that ultimately a Directory is just wrapping and managing a set of buffers, so you probably want to get to those buffers. -- Sanne On 17 January 2014 23:23, Konstantyn Smirnov wrote: > Hi all, > > In Lucene

Re: [EXTERNAL] Re: general question

2015-04-01 Thread Sanne Grinovero
Hello all, I don't need to do the same, but the suggestions got me curious. Why would you consider it more efficient to iterate on the child scorers, rather than performing an independent Query on each field? (assuming he indexes each {table,column} content in a different field) Thanks, Sanne O

Re: RAMDirectory doesn't win over FSDirectory all the time, why?

2011-06-17 Thread Sanne Grinovero
Hello, I came to similar conclusions, and have a similar comparison test available here: https://github.com/infinispan/infinispan/blob/master/lucene-directory/src/test/java/org/infinispan/lucene/profiling/PerformanceCompareStressTest.java In my test I explicitly run the RAMDirectory first to warmu

Re: distributing the indexing process

2011-06-30 Thread Sanne Grinovero
Hello, you could have each node build a separate index, and then merge the result back in a single consistent index using org.apache.lucene.index.IndexWriter.addIndexes(Directory...) Regards, Sanne 2011/6/30 Guru Chandar : > Thanks for the response. The documents are all distinct. My (limited)
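
The build-separately-then-merge approach can be pictured with a toy inverted index. This sketch only models the end result (one index that answers for every node's documents); it says nothing about how Lucene stores or combines data internally. `ToyIndex` and its methods are hypothetical names for this example, and it assumes, as the original poster states, that the documents on each node are distinct.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy inverted index: maps each term to the ids of the documents containing it. */
final class ToyIndex {
    private final Map<String, List<String>> postings = new TreeMap<>();

    /** Indexes one document by splitting its text on whitespace. */
    void addDocument(String docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            List<String> docs = postings.computeIfAbsent(term, t -> new ArrayList<>());
            if (!docs.contains(docId)) {
                docs.add(docId);
            }
        }
    }

    /** Folds the other indexes into this one; assumes document ids do not overlap. */
    void addIndexes(ToyIndex... others) {
        for (ToyIndex other : others) {
            for (Map.Entry<String, List<String>> entry : other.postings.entrySet()) {
                postings.computeIfAbsent(entry.getKey(), t -> new ArrayList<>()).addAll(entry.getValue());
            }
        }
    }

    /** Returns a copy of the document ids that contain the term. */
    List<String> search(String term) {
        return new ArrayList<>(postings.getOrDefault(term.toLowerCase(Locale.ROOT), new ArrayList<>()));
    }

    public static void main(String[] args) {
        ToyIndex nodeA = new ToyIndex();
        nodeA.addDocument("a1", "Lucene index");
        ToyIndex nodeB = new ToyIndex();
        nodeB.addDocument("b1", "Lucene cluster");

        ToyIndex merged = new ToyIndex();
        merged.addIndexes(nodeA, nodeB);
        System.out.println(merged.search("lucene"));
    }
}
```

In a real deployment the per-node indexes would be Lucene Directory instances passed to IndexWriter.addIndexes(Directory...), as the reply suggests.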

Re: full text searching in cloud for minor enterprises

2011-07-07 Thread Sanne Grinovero
Hello, We can try giving you some directions if you could explain some more details of what you need. First thing, cloud providers are rather different: most allow you to fully control the hosts assigned to you (root access) such as Amazon and OpenShift, while others like the Google App Engine impo

Re: how to do simple search paging results of 100 each? and query syntax question

2011-07-14 Thread Sanne Grinovero
Hello, sorry for the late reply. I don't think that generally NoSQL users need a ScrollableResult as usually NoSQL is being used in big data environments, in which case it's preferred to send your computation and data crunching to the data as with Map/Reduce operations (but not limited to) rather t

Re: [WARNING] Index corruption and crashes in Apache Lucene Core / Apache Solr with Java 7

2011-07-29 Thread Sanne Grinovero
Hello, thanks for the warning, that's a pretty nasty bug. A patch was made for OpenJDK, if anybody is interested to try it out that would be great: http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/4e761e7e6e12 Regards, Sanne 2011/7/28 Uwe Schindler : > Hello Apache Lucene & Apache Solr us

Re: Build RAMDirectory on FSDirectory, and then synchronzing the two

2012-01-12 Thread Sanne Grinovero
Maybe you could explain why you are doing this? Someone could suggest alternative approaches. Regards, Sanne On Jan 12, 2012 4:02 AM, "dyzc" <1393975...@qq.com> wrote: > That lies in that my apps add indexes to those in RAM rather than update > them. So the size doubled. Seem not related to the O

Re: Hibernate Search with Regex based on Table

2012-09-17 Thread Sanne Grinovero
Right, you should use the MappingCharFilter from Solr; Hibernate Search can use the Solr tokenizers and filters: http://docs.jboss.org/hibernate/search/4.2/reference/en-US/html_single/#d0e462 To answer your other questions: > In short: Would it be possible to introduce Hibernate Search in the > p