Storing Json field in Lucene
Hi,

I am currently storing indexed fields and stored fields in separate databases. The stored-field database holds the document ID, a type, and a JSON string of metadata, so I am basically using it as a key-value store. For every document to be indexed we have three different metadata structures to store. That is why we keep both the document ID and the type: we can query and retrieve the stored fields by type. We have to depend on Lucene because we don't have any other database to store the data.

Is it a good idea to store the complete JSON as a string in Lucene? If we store the values as separate fields, we have around 30 fields, so there will be 30 seeks to get the complete set of stored fields. If we store them as JSON, it is one seek to retrieve the data. On the other hand, since it is JSON, each field name is stored along with its value in every record, which may bloat the index size.

Could you advise which is the better approach: storing JSON or storing individual fields?

Regards,
Ganesh
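To put a rough number on the bloat concern, here is a minimal plain-Java sketch (no Lucene involved) that compares the UTF-8 size of one record written as a flat JSON object against the size of its values alone. The class name, the `field0`..`field29` names and the sample values are all made up for illustration, and the JSON writer is deliberately minimal (it only escapes backslashes and double quotes). Recent Lucene versions compress stored fields, so the on-disk overhead of repeated names is likely smaller than the raw figure; measuring on real data is the only reliable answer.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Rough per-record size comparison: one JSON blob vs. the values alone.
final class StoredSizeSketch {

    // Minimal JSON writer for a flat map of string values.
    // Assumption: only backslash and double quote need escaping in the sample data.
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.append('}').toString();
    }

    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Total bytes of the values only, i.e. what is left if names are not repeated.
    static int valueBytes(Map<String, String> fields) {
        int total = 0;
        for (String v : fields.values()) {
            total += utf8Bytes(v);
        }
        return total;
    }

    // Hypothetical record with `count` fields named field0, field1, ...
    static Map<String, String> sampleRecord(int count) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            m.put("field" + i, "value" + i);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, String> rec = sampleRecord(30);
        int json = utf8Bytes(toJson(rec));
        int values = valueBytes(rec);
        System.out.println("JSON bytes: " + json
                + ", value bytes: " + values
                + ", overhead per record: " + (json - values));
    }
}
```

Multiplying the per-record overhead by the expected number of records gives an upper bound on the extra uncompressed bytes the JSON form would add.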
Re: Storing and retrieving Java objects in Lucene
Hi Santosh,

>> Furthermore converting the Lucene Documents to Java object and vice-versa is a tedious task.

This should not be tedious. How big is your document? One suggestion is to convert your Java object to JSON and store it in Lucene as a single field. You then need to retrieve only that one field, and you can easily convert it back to an object.

Regards,
Ganesh

On 20-02-2018 08:34, Kumar, Santosh wrote:

> Hi,
>
> I have a requirement to store a Java object with multiple fields in a Lucene index. At application startup I run a select query on the entities (there are 5 of them as of now, and the number may increase in future) and then create an index for each of these entities, i.e. five different indexes as of now (I cannot have a common index; the entity data needs to be kept separate). Ideally I would have liked to store only the primary key field, but I need the rest of the fields upon fetch.
>
> I use this index (basically only the primary key field) to prevent users from creating duplicate entities, or to suggest alternatives like a Google-style "Did you mean?" feature. For this purpose I'm using the SpellChecker module to suggest entities or identify duplicates. Since the spell checker only returns a String array, I again have to run a separate search on the index (QueryParser search) or run a select on the DB to fetch the entire object.
>
> Furthermore, converting the Lucene Documents to Java objects and vice versa is a tedious task. Is there any API or library that can simplify this task? I have heard of the Compass API, but I am not sure if it is still recommended. Any examples or APIs will be appreciated.
>
> Thank you and regards,
> Santosh
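If JSON is not wanted, the conversion can still be kept in one place per entity. The following is only a sketch in plain Java: a `Map<String, String>` stands in for the stored fields of a Lucene Document, and the `Customer` entity, its field names and the `FieldMapper` interface are hypothetical names invented for this example. In real code `toFields` would add stored fields to a Document and `fromFields` would read them back with `Document.get(name)`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity; the fields are made up for illustration.
final class Customer {
    final long id;
    final String name;
    final String city;

    Customer(long id, String name, String city) {
        this.id = id;
        this.name = name;
        this.city = city;
    }
}

// One mapper per entity keeps the object <-> fields conversion in a single place.
interface FieldMapper<T> {
    Map<String, String> toFields(T entity);

    T fromFields(Map<String, String> fields);
}

final class CustomerMapper implements FieldMapper<Customer> {
    @Override
    public Map<String, String> toFields(Customer c) {
        // The map is a stand-in for the stored fields of one document.
        Map<String, String> f = new LinkedHashMap<>();
        f.put("id", Long.toString(c.id));
        f.put("name", c.name);
        f.put("city", c.city);
        return f;
    }

    @Override
    public Customer fromFields(Map<String, String> f) {
        // Stored values come back as strings, so numeric fields are parsed here.
        return new Customer(Long.parseLong(f.get("id")), f.get("name"), f.get("city"));
    }
}

final class MapperDemo {
    public static void main(String[] args) {
        FieldMapper<Customer> mapper = new CustomerMapper();
        Map<String, String> stored = mapper.toFields(new Customer(42L, "Acme", "Pune"));
        Customer back = mapper.fromFields(stored);
        System.out.println(stored + " -> " + back.id + ", " + back.name + ", " + back.city);
    }
}
```

With five entities this means five small mapper classes, and the indexing and search code never touches individual field names directly.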
Re: Concurrent Execution Exception
I have 5 shards, all holding similar data, and this issue occurs in only one of them. I am adding a string field to the index. Its value is numeric ("1301010101"), and the field is used for sorting. During search, I create a SortField object with type SortField.INT. This works fine for most customers, and until now I have not faced any issue.

When creating the SortField, do I need to pass the default int parser? I guess it is currently using the encoded int parser.

Regards,
Ganesh

From: Uwe Schindler
To: java-user@lucene.apache.org; 'Ganesh M'
Sent: Thursday, February 14, 2013 7:53 PM
Subject: RE: Concurrent Execution Exception

I have no idea what you are doing; the issue here could be a field that mixes old-style string-only numerics and new-style numeric fields. If you sort against such a "mixed" field you get this error.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Ganesh M [mailto:emailg...@ymail.com]
> Sent: Thursday, February 14, 2013 1:43 PM
> To: lucene
> Subject: Concurrent Execution Exception
>
> Could anyone shed light on this? I'm using Lucene 3.0.3. I have multiple
> shards and use ParallelMultiSearcher to search across them.
>
> Exception: java.util.concurrent.ExecutionException:
> java.lang.NumberFormatException: Invalid shift value in prefixCoded string
> (is encoded value really an INT?) ID: 256566961
> org.apache.lucene.search.ParallelMultiSearcher$ExecutionHelper.next(ParallelMultiSearcher.java:225)
> org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:127)
> org.apache.lucene.search.Searcher.search(Searcher.java:49)
>
> Regards
> Ganesh
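One possibility that would fit "only one shard fails": as far as I recall, when no parser is given, the 3.0.x FieldCache first tries a plain Integer.parseInt-style parser and, if any term fails, retries with the prefix-coded (NumericUtils) parser, which then reports "Invalid shift value" on a plain string. If that is what happens here, a single value in that shard that is not a valid int would be enough: a stray character, surrounding whitespace, or a number above Integer.MAX_VALUE (2147483647). This is an assumption about the cause, not a confirmed diagnosis. The sketch below is plain Java with no Lucene dependency; the class name and the sample values are hypothetical. It lists the values that Integer.parseInt rejects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Finds string values that cannot be parsed as a 32-bit int.
final class IntSortValueCheck {

    // Returns the values Integer.parseInt rejects: non-digit characters,
    // whitespace, empty strings, null, or numbers outside the int range.
    static List<String> notParsableAsInt(List<String> values) {
        List<String> bad = new ArrayList<>();
        for (String v : values) {
            try {
                Integer.parseInt(v);
            } catch (NumberFormatException e) {
                bad.add(v);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hypothetical sample: one good value, one overflow,
        // one with a leading space, one with a letter O instead of a zero.
        List<String> sample = Arrays.asList(
                "1301010101", "2301010101", " 1301010101", "13O1010101");
        System.out.println(notParsableAsInt(sample));
    }
}
```

Running a check like this over the sort field's values in the failing shard should show whether such a value is present. If I remember the 3.0 API correctly, passing the default int parser explicitly when building the SortField would skip the fallback, so the exception would name the offending value instead.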
Concurrent Execution Exception
Could anyone shed light on this? I'm using Lucene 3.0.3. I have multiple shards and use ParallelMultiSearcher to search across them.

Exception: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: Invalid shift value in prefixCoded string (is encoded value really an INT?) ID: 256566961
org.apache.lucene.search.ParallelMultiSearcher$ExecutionHelper.next(ParallelMultiSearcher.java:225)
org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:127)
org.apache.lucene.search.Searcher.search(Searcher.java:49)

Regards,
Ganesh