subject:"String bytes can be at most 32766 characters in length\?"

Re: String bytes can be at most 32766 characters in length?

2015-09-03 Thread Zheng Lin Edwin Yeo

Thanks for your advice Alexandre. On 3 September 2015 at 20:29, Alexandre Rafalovitch wrote: > Probably because your signatureField and your fields are the same! You > need to point signatureField at a new (not-ID) field. > > You will still get duplicates, as you requested that in your other > e

Re: String bytes can be at most 32766 characters in length?

2015-09-03 Thread Alexandre Rafalovitch

Probably because your signatureField and your fields are the same! You need to point signatureField at a new (not-ID) field. You will still get duplicates, as you requested that in your other emails, but now you would be able to group on that new signature field. If you have any further problems,
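Pointing signatureField at a dedicated field could look like the sketch below. The field name `signature` and the source field `content` are assumptions borrowed from values quoted elsewhere in this thread; the element and attribute names follow the Solr De-Duplication documentation.

```xml
<!-- schema.xml: a separate field to hold the computed hash
     (the name "signature" is an assumption, not confirmed by the thread) -->
<field name="signature" type="string" indexed="true" stored="true" multiValued="false" />

<!-- solrconfig.xml: inside the SignatureUpdateProcessorFactory processor element -->
<str name="signatureField">signature</str>  <!-- where the hash is written; not the ID field -->
<str name="fields">content</str>            <!-- fields the hash is computed from -->
```

With the hash stored in its own field, result grouping on that field (for example `group=true&group.field=signature`) should collapse documents with identical content at query time, while all of them remain in the index.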

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo

Hi Alexandre, Thanks for pointing out the error. I'm able to get the documents indexed after adding the two processors. However, I'm still seeing all the similar documents returned when searching the content, without being de-duplicated. My content is currently indexed as fieldType=text_general.

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Alexandre Rafalovitch

And that's because you have an incomplete chain. If you look at the full example in solrconfig.xml, it shows: true id false name,features,cat solr.processor.Lookup3Signature Notice, the last two processors. I
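The values quoted in this message (true, id, false, name,features,cat, solr.processor.Lookup3Signature) look like the element contents of the stock dedupe example with the XML markup stripped by the archive. A sketch of the complete chain, assuming the element names from the Solr De-Duplication documentation and the chain name `dedupe`, might look like this:

```xml
<!-- solrconfig.xml: sketch of a complete dedupe chain
     (element names assumed from the stock Solr example) -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">name,features,cat</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <!-- The last two processors: logging, then the step that actually executes the update -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

If a chain ends without `solr.RunUpdateProcessorFactory`, the update is never executed, which would be consistent with indexing appearing to run normally while nothing ends up in the index.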

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo

Hi Erick, I couldn't really find anything special in the logs. The indexing process just went on normally, but after that when I check the index, there is nothing indexed. This is what I see from the logs. Looks the same as when the indexing works fine. INFO - 2015-09-03 01:24:35.316; [collecti

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Erick Erickson

_How_ does it fail? You must be seeing something in the logs. On Wed, Sep 2, 2015 at 8:29 AM, Zheng Lin Edwin Yeo wrote: > Hi Erick, > > Yes, I'm trying out the De-Duplication too. But I'm facing a problem with > that, which is the indexing stops working once I put in the following > De-Dupl

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo

Hi Erick, Yes, I'm trying out the De-Duplication too. But I'm facing a problem with that: indexing stops working once I put the following De-Duplication code in solrconfig.xml. The problem seems to be with this dedupe line. dedupe true signature false content

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Erick Erickson

Yes, that is an intentional limit for the size of a single token, which strings are. Why not use deduplication? See: https://cwiki.apache.org/confluence/display/solr/De-Duplication You don't have to replace the existing documents, and Solr will compute a hash that can be used to identify identica

String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo

Hi, I would like to check: must string bytes be at most 32766 characters in length? I'm trying to do a copyField of my rich-text documents' content to a field with fieldType=string, to try getting distinct results for content, as there are several documents with the exact same content,

9 matches
