Re: Is there a way to retrieve the a term's position/offset in Solr

2017-03-28 Thread forest_soup
Thanks, all! Actually, we are going to show the highlighted words in a rich-text format instead of the plain text that was indexed, so hl.fragsize=0 does not seem to work for me. As for the patch (SOLR-4722), I haven't tried it yet. I hope it can return the position/offset info. Thanks!

Re: Fieldtype json supported in SOLR 5.4.0 or 5.4.1

2017-03-28 Thread Rick Leir
Abhijit, in Mongo you probably have one JSON record per document. You can post that JSON record to Solr, and the JSON fields get indexed. The GitHub project you mention does just that. If you use the Solr managed schema, then Solr will automatically define fields based on what it receives.
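As a rough illustration of posting one JSON record to Solr, here is a minimal Python sketch. It only builds the HTTP request and does not send it. The base URL, the collection name "mycollection", and the sample record are assumptions made for the example; /update/json/docs is Solr's endpoint for indexing arbitrary JSON documents.

```python
import json
import urllib.request


def build_json_update_request(solr_base_url, collection, record, commit=True):
    """Build (but do not send) a POST request that indexes one JSON record."""
    # /update/json/docs accepts arbitrary JSON documents.
    url = f"{solr_base_url.rstrip('/')}/{collection}/update/json/docs"
    if commit:
        # Ask Solr to commit right away so the document becomes searchable.
        url += "?commit=true"
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical record, shaped like something exported from a MongoDB collection.
    record = {"id": "doc-1", "title": "Example", "tags": ["solr", "json"]}
    request = build_json_update_request(
        "http://localhost:8983/solr", "mycollection", record
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

With a managed schema, fields that Solr has not seen before would be added as the record is indexed; with a hand-maintained schema, the fields would have to be defined first.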

Fieldtype json supported in SOLR 5.4.0 or 5.4.1

2017-03-28 Thread Abhijit Pawar
Hello all, I am working on a requirement to index a field of type JSON (in a MongoDB collection) in SOLR 5.4.0. I am using mongo-jdbc-dih, which I found on GitHub: https://github.com/hrishik/solr-mongodb-dih However, I could not find a fieldtype on the Apache SOLR wiki page which would support JSON

RE: Indexing speed reduced significantly with OCR

2017-03-28 Thread Phil Scadden
Well, I haven't had to deal with a problem that size, but it seems to me that you have little alternative except to throw more computer hardware at it. For the job I did, I OCRed to convert PDF to searchable PDF outside the indexing workflow. I used the pdftotext utility to extract text from the PDF. If

Re: losing records during solr updates

2017-03-28 Thread Erick Erickson
Shawn: Two questions: 1> _how_ are you restarting a node? kill -9 is A Bad Thing, for instance. Use the 'bin/solr stop' option. 2> How are you indexing? If you're using SolrJ, then a successful response should indicate that the raw documents have been written to _all_ active replicas' tlogs, so

RE: Is CloudSolrClient thread-safe?

2017-03-28 Thread Mikhail Ibraheem
Thanks so much, Shawn. I use SolrJ 6.4.0 and SolrCloud 6.4.0. The code is very simple: I have a Spring singleton bean to get one and only one instance of solrCloudClient, as follows: @Value("${zkHost.url:rws00dtr.us.oracle.com:2181,rws00dtr.us.oracle.com:2182,rws00dtr.us.oracle.com:2183}") private

Re: why leader replica does not call HdfsTransactionLog.finish

2017-03-28 Thread Erick Erickson
I'm pretty sure that transaction logs are local to each replica. On Tue, Mar 28, 2017 at 12:35 AM, Yang.Liu wrote: > DistributedUpdateProcessor.java > @Override > public void finish() throws IOException { > assert ! finished : "lifecycle sanity check"; > finished

Re: managing active/passive cores in Solr and Haystack

2017-03-28 Thread Alexandre Rafalovitch
Did you look at: http://chronix.io/ Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and experienced On 28 March 2017 at 12:33, serwah sabetghadam wrote: > Dear all, > > Do you know any good reference/best practice for Solr to

Re: managing active/passive cores in Solr and Haystack

2017-03-28 Thread serwah sabetghadam
Dear all, Do you know any good reference/best practice for Solr to work with time-series data, time-based indexes, or retiring data? From what I found, it seems to me that we should simulate the configuration ourselves through distributed search. Any help is highly appreciated, Best, Serwah On Fri,

why leader replica does not call HdfsTransactionLog.finish

2017-03-28 Thread Yang.Liu
DistributedUpdateProcessor.java @Override public void finish() throws IOException { assert ! finished : "lifecycle sanity check"; finished = true; if (zkEnabled) doFinish(); if (next != null && nodes == null) next.finish(); } HdfsTransactionLog.finish() will call

Re: Is CloudSolrClient thread-safe?

2017-03-28 Thread Shawn Heisey
On 3/28/2017 8:13 AM, Mikhail Ibraheem wrote: > I have a project with solr and spring. I have only one instance of > CloudSolrClient at the context (singleton). > > This is because I believe that there should be only one instance to load balance > between the nodes. Yes, the intent is to be

Re: Get a java.lang.NumberFormatException when initializing a SOLR Core

2017-03-28 Thread Alexandre Rafalovitch
Are you absolutely sure that you are starting and restarting Solr in the same way? You could look at the overview page to compare properties and at the core/collection's overview page to check the filesystem paths. Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and

Re: Indexing speed reduced significantly with OCR

2017-03-28 Thread Walter Underwood
Converting from PDF to text is embarrassingly parallel. You can throw as many machines at it as you want. This is a great time to use a cloud computing service. Need 1000 machines? No problem. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Mar
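To illustrate why this kind of conversion parallelizes so easily, here is a minimal single-machine sketch in Python. It assumes the pdftotext command-line utility is installed and that the PDFs already carry a text layer (for example after a separate OCR pass). The function names and the worker count are hypothetical choices for the example, not a tested production setup.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pdf_to_text(pdf_path):
    """Convert one PDF with the external pdftotext utility; return the .txt path."""
    txt_path = Path(pdf_path).with_suffix(".txt")
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(["pdftotext", str(pdf_path), str(txt_path)], check=True)
    return txt_path


def convert_all(pdf_paths, convert=pdf_to_text, workers=4):
    """Run `convert` over every path concurrently, preserving input order."""
    # Threads suffice here: the heavy work happens in the pdftotext child
    # processes, so the Python threads mostly wait on them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert, pdf_paths))
```

Each file is converted independently of the others, so the same pattern extends from threads on one machine to many machines: split the list of files and give each machine its own share.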

Re: Version upgrading approaches

2017-03-28 Thread John Blythe
all good info, appreciate it from you both -- *John Blythe* Product Manager & Lead Developer 251.605.3071 | j...@curvolabs.com www.curvolabs.com 58 Adams Ave Evansville, IN 47713 On Tue, Mar 28, 2017 at 8:54 AM, Shawn Heisey wrote: > On 3/27/2017 5:51 AM, John Blythe

Is CloudSolrClient thread-safe?

2017-03-28 Thread Mikhail Ibraheem
Hi, I have a project with Solr and Spring. I have only one instance of CloudSolrClient in the context (singleton). This is because I believe that there should be only one instance to load balance between the nodes. I have a big issue that doesn't occur every time; it is intermittent:

AW: AW: Newbie in Solr

2017-03-28 Thread Ercan Karadeniz
As you can see, Alexandre replied to the wrong thread (see below); that is where the confusion comes from. From: Shawn Heisey Sent: Tuesday, 28 March 2017 15:00 To: solr-user@lucene.apache.org

Re: Get a java.lang.NumberFormatException when initializing a SOLR Core

2017-03-28 Thread robg
I want to add the following to my original post: I posted this problem on the SOLRNet forum but have gotten no guidance there so I hope the users of the SOLR forum might be able to assist.

AW: AW: Newbie in Solr

2017-03-28 Thread Ercan Karadeniz
Hi Shawn, I'm going to create a new thread, sorry for the confusion. Thx! Regards, Ercan From: Shawn Heisey Sent: Tuesday, 28 March 2017 15:00 To: solr-user@lucene.apache.org Subject: Re: AW: Newbie in Solr On 3/28/2017 5:35 AM,

Get a java.lang.NumberFormatException when initializing a SOLR Core

2017-03-28 Thread robg
I have a simple Visual Studio 2013 C# application using SOLRNET to test indexing a document by a standalone single-core SOLR instance (v6.4.1 or v6.4.2 running on Windows 2008 R2). The document class is a simple one, having only 2 fields - a C# string ID (a SOLR string & a unique field) and a C#

Re: Is there a way to retrieve the a term's position/offset in Solr

2017-03-28 Thread simon
You might want to take a look at the patch in https://issues.apache.org/jira/browse/SOLR-4722 - 'Highlighter which generates a list of query term position(s) for each item in a list of documents, or returns null if highlighting is disabled.' I've used it for retrieving the term positions with no

Re: AW: Newbie in Solr

2017-03-28 Thread Shawn Heisey
On 3/28/2017 5:35 AM, Ercan Karadeniz wrote: > > thanks for the info. I guess it was not me but rather Alexandre. > > Now I have adjusted the mail subject and someone can respond to my > question. https://home.apache.org/~hossman/#threadhijack Showing your message in context, lost in another

Re: Version upgrading approaches

2017-03-28 Thread Shawn Heisey
On 3/27/2017 5:51 AM, John Blythe wrote: > The new versions of solr come out in pretty regular fashion. We are > currently on 6.0. I'm curious what drives you / your team to run the > upgrades when you do. Particular features or patches you're > eyeballing? Only concerned w major releases? Some

Re: Closed connection issue while doing dataimport

2017-03-28 Thread Shawn Heisey
On 3/27/2017 7:13 PM, santosh sidnal wrote: > I am facing a closed connection issue while running the data importer. Is there any solution > to this? The stack trace is as below > > > [3/27/17 8:54:41:399 CDT] 00b4 OracleDataSto > findMappingClass for : > Entry >

AW: Newbie in Solr

2017-03-28 Thread Ercan Karadeniz
Hi Shawn, thanks for the info. I guess it was not me but rather Alexandre. Now I have adjusted the mail subject and someone can respond to my question. Thanks! Best regards, Ercan From: Shawn Feldman Sent: Monday, 27 March

Re: Is there a way to retrieve the a term's position/offset in Solr

2017-03-28 Thread Bjarke Buur Mortensen
Well, you can get Solr to highlight the entire field if that's what you are after by setting: hl.fragsize=0 From https://cwiki.apache.org/confluence/display/solr/Highlighting#Highlighting-Usage : Specifies the approximate size, in characters, of fragments to consider for highlighting. *0*
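For reference, here is a small Python sketch of a select URL that turns on highlighting with hl.fragsize=0. It only builds the URL and sends nothing. The host, the collection name "docs", and the field name "content" are assumptions made for the example; hl, hl.fl and hl.fragsize are the standard highlighting parameters.

```python
from urllib.parse import urlencode


def build_highlight_url(solr_base_url, collection, query, field):
    """Return a /select URL asking Solr to highlight the whole field."""
    params = {
        "q": query,
        "hl": "true",        # turn highlighting on
        "hl.fl": field,      # field to highlight
        "hl.fragsize": 0,    # 0 = use the entire field value, no fragmenting
        "wt": "json",
    }
    return f"{solr_base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_highlight_url(
        "http://localhost:8983/solr", "docs", "content:solr", "content"
    ))
```

The response's highlighting section should then carry the full field text with the matched terms wrapped in the highlighter's pre/post tags, rather than short snippets.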

Re: Exception in export handler when using fq collapse that only returns one result

2017-03-28 Thread David Svånå
Thanks for the answer - I'll file a bug in JIRA on this if there are no other comments. On Sat, Mar 25, 2017 at 4:44 PM, Joel Bernstein wrote: > I would consider this a bug. Collapse has never really been tested with > export. But this would certainly speed up the unique

Re: Schema API: Modify Unique Key

2017-03-28 Thread nabil Kouici
Thank you for your reply. We would like to have this functionality in order to change the unique key to do a partial update. A partial update cannot work without a unique key, and our need is to do something like SQL (update documents set documents.field1=wyz where documents.field2 = xxx). So we put field2

Re: Indexing speed reduced significantly with OCR

2017-03-28 Thread Zheng Lin Edwin Yeo
Hi, Do you have any suggestions for coping with the expensive process of indexing documents that require OCR? In my current situation, the indexing takes about 2 weeks to complete. If the average indexing speed is, say, 50 times slower, it means it will require 100 weeks to index

Re: Is there a way to retrieve the a term's position/offset in Solr

2017-03-28 Thread forest_soup
Thanks Eric. Actually, the Solr highlighting function does not meet my requirement. My requirement is not to show the highlighted words in snippets, but to show them in the whole opened document. So I would like to get the term's position/offset info from Solr. I went through the highlight feature, but