Re: Copy field and regex

2017-12-08 Thread Shawn Heisey
On 12/8/2017 1:03 PM, Erick Erickson wrote: Second, grouping works fine in distributed mode with a couple of restrictions, see the reference guide. Collapse/Expand (an alternative to standard grouping) requires that all the members of a group be on the same shard. In 5.x, distributed grouping

Re: Learning to Rank (LTR) with grouping

2017-12-08 Thread Roopa Rao
Hi Diego, Thank you, I will look into this and see how I could patch this. Thank you for your quick response, Roopa On Fri, Dec 8, 2017 at 5:44 PM, Diego Ceccarelli wrote: > Hi Roopa, > > LTR is implemented using RankQuery, and at the moment grouping doesn't >

Re: JSON-B deserialization of Solr-response with highlightning

2017-12-08 Thread Chris Hostetter
: We've started to migrate our integration framework over to : JavaEE JSON-B as the default JSON serialization/deserialization framework, : and now the highlighting component is giving us some trouble. Here's a : constructed example of the JSON response from Solr. Wait .. what? that

Re: Learning to Rank (LTR) with grouping

2017-12-08 Thread Diego Ceccarelli
Hi Roopa, LTR is implemented using RankQuery, and at the moment grouping doesn't support RankQuery. I opened a Jira issue some time ago (https://issues.apache.org/jira/browse/SOLR-8776) and I would be happy to receive feedback on that. You can find the code here

RE: FW: Need Help Configuring Solr To Work With Nutch

2017-12-08 Thread Mukhopadhyay, Aratrika
Rick, Thanks for your reply. I do not see any errors or exceptions in the Solr logs. I have read that my schema in Nutch needs to match the schema in Solr. When I change the schema in the config directory and restart Solr, my changes are lost. Leaving the schema alone is the only

Learning to Rank (LTR) with grouping

2017-12-08 Thread Roopa Rao
Hi, I am using grouping and LTR together, and the results are not getting re-ranked as they are without grouping. I am passing parameter. Does LTR work with grouping on? Solr version 6.5 Thank you, Roopa

Re: FW: Need Help Configuring Solr To Work With Nutch

2017-12-08 Thread Rick Leir
Ara, Soft commit might be the default in solrconfig.xml, and if not then you should probably make it so. Then you need to have a look in solr.log if things are not working as you expect. Cheers -- Rick On December 8, 2017 3:23:35 PM EST, "Mukhopadhyay, Aratrika"

Haystack: The Search Relevance & Cognitive Search Conference

2017-12-08 Thread Doug Turnbull
Join us at Haystack, April 10 & 11 where we discuss advanced technical topics on search relevance and cognitive search! We'll discuss topics on applied relevance engineering with fellow practitioners, in Solr, Elasticsearch, Vespa, and adjacent technologies Topics include: - Learning to Rank -

FW: Need Help Configuring Solr To Work With Nutch

2017-12-08 Thread Mukhopadhyay, Aratrika
Erick, Do I need to set the softCommit = true and prepareCommit to true in my solrconfig ? I am still at a loss as to what is happening. Thanks again for your help. Aratrika From: Mukhopadhyay, Aratrika Sent: Friday, December 08, 2017 11:34 AM To: solr-user

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-08 Thread Erick Erickson
bq: Will TLOG replicas use less network bandwidth? No, probably more bandwidth. TLOG replicas work like this: 1> the raw docs are forwarded 2> the old-style master/slave replication is used So what you do save is CPU processing on the TLOG replica in exchange for increased bandwidth. Since the

Re: Copy field and regex

2017-12-08 Thread Erick Erickson
Grouping does _not_ require docValues, it's just that with docValues=false, the uninverted structure is built on the heap at run time. When docValues=true, the uninverted structure is written to disk at index time and MMapped into the OS's memory space rather than the Java heap. Second, grouping

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-08 Thread Joe Obernberger
Anyone have any thoughts on this?  Will TLOG replicas use less network bandwidth? -Joe On 12/4/2017 12:54 PM, Joe Obernberger wrote: Hi All - this same problem happened again, and I think I partially understand what is going on.  The part I don't know is what caused any of the replicas to

Re: indexing XML stored on HDFS

2017-12-08 Thread Matthew Roth
Thanks Rick, While long-term storage of the documents in HDFS is not necessary, you do raise the point that easy access to these documents during the development phase will be useful. Cassandra, spark-solr I am under the impression that I must be running SolrCloud. At this time I need some of the

Re: Copy field and regex

2017-12-08 Thread Bradley Belyeu
Ah, thank you Erick & Shawn. That makes perfect sense. And yes when this goes to prod it will be distributed. Good point about docValues and needing a single shard, thanks! I’m new to result grouping, so I’m still prototyping that it will work for what I need. On 12/8/17, 12:00 PM, "Erick

Re: indexing XML stored on HDFS

2017-12-08 Thread Cassandra Targett
Matthew, The hadoop-solr project you mention would give you the ability to index files in HDFS. It's a Job Jar, so you submit it to Hadoop with the params you need and it processes the files and sends them to Solr. It might not be the fastest thing in the world since it uses MapReduce but we (I

Re: Copy field and regex

2017-12-08 Thread Erick Erickson
I think you're getting confused by seeing the _stored_ data rather than the indexed data. When you return fields in documents, you get the stored data which is a verbatim copy of the input, no analysis done at all. To see what's in the index (and thus what would be grouped on) look at:

Re: Copy field and regex

2017-12-08 Thread Shawn Heisey
On 12/8/2017 9:56 AM, Bradley Belyeu wrote: > I’m wanting to do a result grouping by the first three characters, period, & > digit(s). For example, docs with the unique keys JHN.3.16 & JHN.3.17 I would > want grouped together. > So my thought was to define another field and then copy the USFM
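The grouping key described in the quoted question (three characters, a period, then one or more digits) can be sketched outside Solr to check that a pattern behaves as intended before putting it into a field type. This is only a minimal illustration: the regular expression and the function name `group_key` are assumptions made for this sketch, not something taken from the thread or from the poster's schema.

```python
import re

# Assumed pattern: any three characters, a literal period, then one or
# more digits, anchored at the start of the key (e.g. "JHN.3").
GROUP_KEY = re.compile(r"^(.{3}\.\d+)")


def group_key(usfm):
    """Return the leading 'XXX.N' part of a key, or None if it does not match."""
    match = GROUP_KEY.match(usfm)
    return match.group(1) if match else None


# Both example keys from the question reduce to the same grouping value.
print(group_key("JHN.3.16"))
print(group_key("JHN.3.17"))
```

If the two example keys reduce to the same value here, the same expression is a reasonable starting point for a pattern-based analysis chain on the copied field, subject to checking the indexed terms in Solr itself.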

Re: How extractingrequest handler works?

2017-12-08 Thread Sreenivas.T
Thanks Erick. I'm using ManifoldCF to connect to a file share and index the content to Solr. So I was thinking of customizing Solr's updateProcessor. However, it looks like ManifoldCF needs to have Tika extraction before indexing to Solr. I am not sure what our approach should be. -Sreenivas On 8

Copy field and regex

2017-12-08 Thread Bradley Belyeu
I’m struggling a bit getting a copy field & regex tokenizer to work like I think it should… I have an open source project I’m just starting out with here: https://github.com/youversion/solrcloud I have a uniqueKey field USFM defined as: And a USFM will always be in the pattern of 3 characters

RE: Need Help Configuring Solr To Work With Nutch

2017-12-08 Thread Mukhopadhyay, Aratrika
Hello Erick, This is what I see in the logs: [cid:image001.png@01D37018.62D3CC90] I am sorry, it's been a while since I worked with Solr. I did not do anything to specifically commit the changes to the core. Thanks for your prompt attention to this matter. Aratrika

Re: How extractingrequest handler works?

2017-12-08 Thread Erick Erickson
I wouldn't extend the extracting request handler at all, just run the custom code independently of Solr. This is generally recommended anyway, here's a way to get started: https://lucidworks.com/2012/02/14/indexing-with-solrj/ The database bits are just there because I wanted to talk about both

Re: Need Help Configuring Solr To Work With Nutch

2017-12-08 Thread Erick Erickson
1> do you see update messages in the Solr logs? 2> did you issue a commit? Best, Erick On Fri, Dec 8, 2017 at 7:27 AM, Mukhopadhyay, Aratrika < aratrika.mukhopadh...@mail.house.gov> wrote: > Good Morning, > >I am running nutch 2.3 , hbase 0.98 and I am integrating nutch > with solr

Need Help Configuring Solr To Work With Nutch

2017-12-08 Thread Mukhopadhyay, Aratrika
Good Morning, I am running Nutch 2.3, HBase 0.98 and I am integrating Nutch with Solr 6.4. I have a successful crawl in Nutch and I see that it is indexing the content into Solr. However, I cannot query and get any results. It's as if Nutch isn't writing anything to Solr at all.

Re: Conditional based Filters on SolrConfig.xml

2017-12-08 Thread Shawn Heisey
On 12/8/2017 4:07 AM, sarat chandra wrote: > Currently we have a request handler contains appends option like below > > > inStock:true > > Now i want to append this filter to query on conditional based. > If my request query contains a flag or if the flag is true, i need to > append the above

Conditional based Filters on SolrConfig.xml

2017-12-08 Thread sarat chandra
Hi, Currently we have a request handler that contains an appends option like below inStock:true Now I want to append this filter to the query conditionally. If my request query contains a flag, or if the flag is true, I need to append the above filter to the query, otherwise the filter should not
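One way to get the conditional behaviour described above is to decide on the client side whether to send the filter, instead of configuring it as an unconditional append. The sketch below is an assumption-laden illustration, not the answer given in the thread: the function name `build_params` and the flag `in_stock_only` are hypothetical, and only the filter value `inStock:true` comes from the question.

```python
from urllib.parse import urlencode


def build_params(q, in_stock_only):
    """Build a Solr query string, adding the fq filter only when the flag is set."""
    params = [("q", q)]
    if in_stock_only:
        # Same filter the question configures as an append in the handler.
        params.append(("fq", "inStock:true"))
    return urlencode(params)


print(build_params("*:*", True))
print(build_params("*:*", False))
```

With this approach the request handler would carry no appended filter at all, and the application decides per request whether the restriction applies.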

How to perform delta-import on SolrCloud mode through a scheduler?

2017-12-08 Thread Sabeer Hussain
I am using Solr 7.1 version and deployed it in standalone mode. I have created a scheduler in my application itself to perform delta-import operation based on a pre-configured frequency. I have used the following lines of code (in java) to invoke delta-import operation URL url =
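For reference, the request a scheduler has to issue can be sketched as plain URL construction. This is a hedged illustration only: the handler path `/dataimport`, the base URL, and the collection name `mycollection` are assumptions (the path depends on how the DataImportHandler is registered in solrconfig.xml), and the sketch builds the URL without sending anything.

```python
from urllib.parse import urlencode


def delta_import_url(base_url, collection, handler="/dataimport"):
    """Build the URL for a DataImportHandler delta-import request."""
    # command=delta-import is the DIH command; wt=json asks for a JSON reply.
    query = urlencode({"command": "delta-import", "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}{handler}?{query}"


# Hypothetical host and collection name, for illustration only.
print(delta_import_url("http://localhost:8983/solr", "mycollection"))
```

A scheduler would then issue an HTTP GET against the resulting URL at the configured frequency.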

How extractingrequest handler works?

2017-12-08 Thread Sreenivas.T
All, How does the extracting request handler internally index Tika-extracted content? Does it internally call an update processor? I have a custom update document processor that needs to work on Tika-extracted content and needs to call an API. Is it that I need to extend that extractingrequesthandler and do