Re: [hibernate-dev] [infinispan-dev] [HSearch] DSL for Lucene queries (was: Re: Query module new

2009-09-27 Thread Manik Surtani
Lexers and parsers are not difficult (a simple ANTLR grammar takes
very little effort, for example), but that is orthogonal to this
discussion.  I believe the very term DSL is misleading (since DSL
implies a separate grammar); this is more about an intuitive builder
API for query creation.  And yes, while this does imply knowledge of
the API, it is certainly far less verbose and more expressive than the
existing query construction mechanism.


Perhaps as a next step, a grammar can be built on top of this.

- Manik


On 25 Sep 2009, at 18:41, Navin Surtani wrote:

In case this email didn't go to the full dev list: I got it as a
separate thread, so I'm forwarding it on.


Begin forwarded message:


From: johng@gmail.com
Date: 25 September 2009 16:00:53 BST
To: Navin Surtani 
Subject: Re: Re: [hibernate-dev] [infinispan-dev] [HSearch] DSL for
Lucene queries (was: Re: Query module new


All,

I think Hardy's original push back came from the first pass's use of
the decorator pattern to try to come up with a DSL. That really
isn't much better than knowing the API. The alternative is to come up
with a more natural language implementation, but that leads to
parsers, lexers, etc. I'm not saying it's not worthwhile, but it
may be a lot of work.


John Griffin

On Sep 25, 2009 8:12am, Navin Surtani  wrote:
> Just wanted to get this topic re-started again.
>
> Essentially what I think this project/DSL/module/thingy-bob is thought
> to become: -
>
> A simple package where a user can build Lucene queries without having
> to know too much about Lucene itself. If I'm headed down the wrong
> thought path then just thwack me.
>
> On 26 Aug 2009, at 21:08, Hardy Ferentschik wrote:
>
> > On Wed, 2009-08-26 at 13:39 +0200, Emmanuel Bernard wrote:
> >> I've been thinking about a DSL to build Lucene queries in the last
> >> day.
> >> What do you think of this proposal?
> >
> > What do you really gain compared to native Lucene queries?
>
> What's gained I believe is the fact that people can build complex
> lucene queries easier. Currently, it's a bit clunky imo so if we
> provide a cleaner way to build them it can prove beneficial to any
> lucene user (myself included for querying on Infinispan).
>
> Any other thoughts?
>
> > If your API achieves exactly the same as what's possible with Lucene
> > it is just a 'useless' wrapper.
> >
> > A wrapper around native Lucene queries would make sense if it could
> > somehow use some of the Hibernate Search specific meta data. As an
> > extreme example one could generate some meta classes a la JPA2. This
> > way one could ensure that you can get help with which field names are
> > available.
> >
> > --Hardy
> >
> > ___
> > infinispan-dev mailing list
> > infinispan-...@lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> Navin Surtani
>
> Intern Infinispan
> Intern JBoss Cache Searchable
>
> ___
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev

Navin Surtani

Intern Infinispan
Intern JBoss Cache Searchable



--
Manik Surtani
ma...@jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org






Re: [hibernate-dev] [infinispan-dev] [HSearch] DSL for Lucene queries (was: Re: Query module new

2009-09-27 Thread Emmanuel Bernard

On 27 sept. 09, at 12:08, Manik Surtani wrote:

> Lexers and parsers are not difficult (a simple ANTLR grammar takes
> very little effort, for example), but that is orthogonal to this
> discussion.  I believe the very term DSL is misleading (since DSL
> implies a separate grammar); this is more about an intuitive
> builder API for query creation.  And yes, while this does imply
> knowledge of the API, it is certainly far less verbose and more
> expressive than the existing query construction mechanism.

That's untrue. External DSLs imply a separate grammar. Internal DSLs  
imply using the hosting language to create a fluent API.
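
To make the internal-DSL point concrete, here is a minimal Java sketch of a
fluent builder. It is only an illustration under assumptions: QueryBuilder,
must, should, mustNot and build are hypothetical names, not the API proposed
in this thread, and the builder simply renders Lucene's classic query-parser
syntax as a string rather than creating Lucene Query objects. Terms are not
escaped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical fluent builder (an "internal DSL"); not the Hibernate Search API. */
final class QueryBuilder {
    private final List<String> clauses = new ArrayList<>();

    private QueryBuilder() {}

    /** Entry point, so that calls read as query().must(...).should(...). */
    static QueryBuilder query() {
        return new QueryBuilder();
    }

    /** Adds a required clause, rendered as +field:term. */
    QueryBuilder must(String field, String term) {
        clauses.add("+" + field + ":" + term);
        return this;
    }

    /** Adds an optional clause, rendered as field:term. */
    QueryBuilder should(String field, String term) {
        clauses.add(field + ":" + term);
        return this;
    }

    /** Adds a prohibited clause, rendered as -field:term. */
    QueryBuilder mustNot(String field, String term) {
        clauses.add("-" + field + ":" + term);
        return this;
    }

    /** Joins the clauses with spaces, in the order they were added. */
    String build() {
        return String.join(" ", clauses);
    }
}
```

For example, QueryBuilder.query().must("title", "lucene").should("body",
"infinispan").mustNot("status", "draft").build() yields the string
+title:lucene body:infinispan -status:draft. Everything is ordinary Java
method chaining; no separate grammar or parser is involved.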


Re: [hibernate-dev] [infinispan-dev] Feedback on Infinispan patch

2009-09-27 Thread Łukasz Moreń
You can try increasing the TURNS_NUM (I've tried with 1000) and THREADS_NUM
(200) fields in InfinispanDirectoryTest to make it more probable. The same
problem also appears in InfinispanDirectoryProviderTest.

An example stacktrace is:

21:22:44,441 ERROR InfinispanDirectoryTest:142 - Error
java.io.IOException: File [ segments_nl ] for index [ indexName ] was not found
    at org.hibernate.search.store.infinispan.InfinispanIndexIO$InfinispanIndexInput.<init>(InfinispanIndexIO.java:79)
    at org.hibernate.search.store.infinispan.InfinispanDirectory.openInput(InfinispanDirectory.java:201)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
    at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:95)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)
    at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)
    at org.hibernate.search.test.directoryProvider.infinispan.CacheTestSupport.doReadOperation(CacheTestSupport.java:106)
    at org.hibernate.search.test.directoryProvider.infinispan.InfinispanDirectoryTest$InfinispanDirectoryThread.run(InfinispanDirectoryTest.java:130)

Cheers,
Lukasz

2009/9/27 Sanne Grinovero 

> Hi Łukasz,
> I'm unable to reproduce the problem, you said it happens randomly:
> I've tried several times
> and I'm not getting errors. Do you know something I could do to make it
> happen?
> Could you share a stacktrace?
>
> Anyway if you are confident it's about the segments getting lost when
> they are still being read,
> you could introduce a per-segment counter of usage; like it starts at
> value 1 to mark the segment
> as "most current", gets a +1 vote at each reader opening it, -1
> closing, and -1 deleting.
> Each decrement method should check for the value reaching 0 to really
> delete it,
> and this counting method would be easy to add inside the Directory.
> When opening a new indexReader, you
> 1) get the SegmentsInfo
> 2) increment all counters (eager-lock, verify>0 or retry : set changed
> counters back and get a new SegmentsInfo-->1)
> 3) get the needed segments
>
> Getting a counter should be much faster than getting a segment in case
> the data is downloaded
> from another node, so we can use a different key while still relating
> to the segment.
>
> Sanne
>
> 2009/9/23 Łukasz Moreń :
> > I agree that Infinispan case is not much different from RAMDirectory. The
> > major difference is that in RD (also FileDirectory) changes are not batched
> > like in ID. If I do not wrap changes in InfinispanDirectory (simply remove
> > tx.begin() from obtain() method and tx.commit() from release() in
> > InfinispanLock), and immediately commit every change made by IW it works
> > well. However it makes indexing much slower, because of frequent
> > replication to other nodes.
> > Sanne it's good remark that IW commit is kind of flush.
> >
> > I've attached patch with InfinispanDirectory, failing test is
> > testDirectoryWithMultipleThreads in InfinispanDirectoryTest class. It fails
> > randomly. I think problem is Infinispan commit on lockRelease() in
> > org.apache.lucene.index.IndexWriter (line 1658) is after IW commit() (line
> > 1654).
> >
> >> Is it because, the IndexWriter only clean files if no indexReaders are
> >> reading them (how would that be detected)?
> >
> > It can happen if IndexWriter clean file, and IndexReader try to access that
> > cleaned file.
> >
> > 2009/9/23 Sanne Grinovero 
> >>
> >> I agree It should work the same way; The IndexWriter cleans files
> >> whenever it likes to, it doesn't try to detect readers, and this
> >> shouldn't have any effect on the working of readers.
> >> The IndexReader opens the "SegmentsInfo" first, and immediately
> >> after** gets a reference to the segments listed in this SegmentsInfo.
> >> No IndexWriter will ever change an existing segment, only add new
> >> files or eventually delete old ones (segments merge,optimize).
> >> The deletion of segments is the interesting subject: when using Files
> >> it uses "delete at last close", which works because the IR needing it
> >> have it opened already**; when using the RAMDirectory they have a
> >> reference preventing garbage collection.
> >>
> >> ( the two "**" are assuming the same event occurred correctly,
> >> otherwise an exception is thrown at opening)
> >>
> >> When using Infinispan it shouldn't be much different than the
> >> RAMDirectory? so even if the needed segment is deleted, the IR holds a
> >> reference to the Java object locally since it was opened.
> >>
> >>  Łukasz, do you have some failing test?
> >>
> >> Sanne
> >>
> >> 2009/9/23 Emmanuel Bernard :
> >> > Conceptually I don't understand why it does work in a pure file system
> >> > directory (ie IndexRead
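
The per-segment usage counter discussed in this thread (start at 1 to mark
the segment as "most current", +1 for each reader that opens it, -1 on close
and on delete, really delete when it reaches 0) can be sketched in plain Java
as below. This is an in-memory illustration under assumptions, not code from
the patch: SegmentUsageCounters, its methods and the purge callback are
hypothetical names, and a real version would presumably keep the counters in
the Infinispan cache under their own keys rather than in a local map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Consumer;

/** Hypothetical per-segment usage counters; an in-memory sketch only. */
final class SegmentUsageCounters {
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
    private final Consumer<String> purge; // assumed hook that really deletes the segment data

    SegmentUsageCounters(Consumer<String> purge) {
        this.purge = purge;
    }

    /** A new segment starts at 1: the "most current" vote held by the writer. */
    void register(String segment) {
        counts.putIfAbsent(segment, 1);
    }

    /** +1 for a reader; returns false if the segment has already been purged. */
    boolean tryAcquire(String segment) {
        return counts.computeIfPresent(segment, (name, n) -> n + 1) != null;
    }

    /** -1 for a reader close or a writer delete; purges when the count reaches 0. */
    void release(String segment) {
        boolean[] reachedZero = {false};
        counts.computeIfPresent(segment, (name, n) -> {
            if (n <= 1) {
                reachedZero[0] = true;
                return null; // returning null removes the counter atomically
            }
            return n - 1;
        });
        if (reachedZero[0]) {
            purge.accept(segment);
        }
    }

    /**
     * Step 2 of the reader-open protocol: increment every counter, or set the
     * changed counters back and report failure so the caller can re-read the
     * segments list and retry.
     */
    boolean tryAcquireAll(List<String> segments) {
        List<String> acquired = new ArrayList<>();
        for (String segment : segments) {
            if (!tryAcquire(segment)) {
                for (String taken : acquired) {
                    release(taken);
                }
                return false;
            }
            acquired.add(segment);
        }
        return true;
    }
}
```

A reader would call tryAcquireAll with the names it read from the segments
file, loop back to re-read that file when the call returns false, and call
release on each segment when it closes. The writer calls release once when
it deletes a segment, which drops the initial vote.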

Re: [hibernate-dev] [infinispan-dev] Feedback on Infinispan patch

2009-09-27 Thread johng . sst

Sanne,

That error looks suspiciously similar to an old Lucene error they had.
Could they have regressed?


John Griffin

On Sep 27, 2009 2:00pm, Łukasz Moreń  wrote:
> You can try increasing the TURNS_NUM (I've tried with 1000) and THREADS_NUM
> (200) fields in InfinispanDirectoryTest to make it more probable. The same
> problem also appears in InfinispanDirectoryProviderTest.
>
> An example stacktrace is:
>
> 21:22:44,441 ERROR InfinispanDirectoryTest:142 - Error
> java.io.IOException: File [ segments_nl ] for index [ indexName ] was not found
>     at org.hibernate.search.store.infinispan.InfinispanIndexIO$InfinispanIndexInput.<init>(InfinispanIndexIO.java:79)
>     at org.hibernate.search.store.infinispan.InfinispanDirectory.openInput(InfinispanDirectory.java:201)
>     at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
>     at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:95)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
>     at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)
>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)
>     at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)
>     at org.hibernate.search.test.directoryProvider.infinispan.CacheTestSupport.doReadOperation(CacheTestSupport.java:106)
>     at org.hibernate.search.test.directoryProvider.infinispan.InfinispanDirectoryTest$InfinispanDirectoryThread.run(InfinispanDirectoryTest.java:130)
>
> Cheers,
> Lukasz



> 2009/9/27 Sanne Grinovero <sanne.grinov...@gmail.com>
>
> > Hi Łukasz,
> > I'm unable to reproduce the problem, you said it happens randomly:
> > I've tried several times
> > and I'm not getting errors. Do you know something I could do to make it
> > happen?
> > Could you share a stacktrace?
> >
> > Anyway if you are confident it's about the segments getting lost when
> > they are still being read,
> > you could introduce a per-segment counter of usage; like it starts at
> > value 1 to mark the segment
> > as "most current", gets a +1 vote at each reader opening it, -1
> > closing, and -1 deleting.
> > Each decrement method should check for the value reaching 0 to really
> > delete it,
> > and this counting method would be easy to add inside the Directory.
> > When opening a new indexReader, you
> > 1) get the SegmentsInfo
> > 2) increment all counters (eager-lock, verify>0 or retry : set changed
> > counters back and get a new SegmentsInfo-->1)
> > 3) get the needed segments
> >
> > Getting a counter should be much faster than getting a segment in case
> > the data is downloaded
> > from another node, so we can use a different key while still relating
> > to the segment.
> >
> > Sanne





> > 2009/9/23 Łukasz Moreń <lukasz.mo...@gmail.com>:
> > > I agree that Infinispan case is not much different from RAMDirectory. The
> > > major difference is that in RD (also FileDirectory) changes are not batched
> > > like in ID. If I do not wrap changes in InfinispanDirectory (simply remove
> > > tx.begin() from obtain() method and tx.commit() from release() in
> > > InfinispanLock), and immediately commit every change made by IW it works
> > > well. However it makes indexing much slower, because of frequent
> > > replication to other nodes.
> > > Sanne it's good remark that IW commit is kind of flush.
> > >
> > > I've attached patch with InfinispanDirectory, failing test is
> > > testDirectoryWithMultipleThreads in InfinispanDirectoryTest class. It fails
> > > randomly. I think problem is Infinispan commit on lockRelease() in
> > > org.apache.lucene.index.IndexWriter (line 1658) is after IW commit() (line
> > > 1654).
> > >
> > >> Is it because, the IndexWriter only clean files if no indexReaders are
> > >> reading them (how would that be detected)?
> > >
> > > It can happen if IndexWriter clean file, and IndexReader try to access that
> > > cleaned file.
> > >
> > > 2009/9/23 Sanne Grinovero <sanne.grinov...@gmail.com>
> > >>
> > >> I agree It should work the same way; The IndexWriter cleans files
> > >> whenever it likes to, it doesn't try to detect readers, and this
> > >> shouldn't have any effect on the working of readers.
> > >> The IndexReader opens the "SegmentsInfo" first, and immediately
> > >> after** gets a reference to the segments listed in this SegmentsInfo.
> > >> No IndexWriter will ever change an existing segment, only add new
> > >> files or eventually delete old ones (segments merge,optimize).
> > >> The deletion of segments is the interesting subject: when using Files
> > >> it uses "delete at last close", which works because the IR needing it
> > >> have it opened already**; when using the RAMDirectory they have a
> > >> reference preventing garbage collection.
> > >>
> > >> ( the two "**" are assuming the same event occurred correctly,
> > >> otherwise an exception is thrown at opening)
> > >>
> > >> When using Infinispan it shouldn't be much different than the
> > >> RAMDirectory? so even if the needed segment is