No, David. I can increase the value, or set it to -1 to make it unlimited, but I still cannot be sure that my whole text will be searchable. That remains a problem with large files, because only the part that is indexed will be searchable. I was looking for some alternatives.
Best Regards,
Sreedevi S

On Tue, Feb 10, 2015 at 2:26 PM, David Pilato <da...@pilato.fr> wrote:

> I don’t understand.
> If you don’t raise this restriction to a higher value (or to -1), not all of
> the text will be extracted, so only a subset of the text will be indexed.
> Non-indexed parts of the text won’t be searchable.
>
> Did I misunderstand your question?
>
> --
> David Pilato | Technical Advocate | Elasticsearch.com
> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>
>
> On 10 Feb 2015, at 09:52, sreedevi s <sreedevi.payik...@gmail.com> wrote:
> >
> > Thank you David. Yes, it has a restriction of 10000 characters.
> > But for large files, what could be done in that case?
> >
> > Best Regards,
> > Sreedevi S
> >
> > On Tue, Feb 10, 2015 at 2:04 PM, David Pilato <da...@pilato.fr> wrote:
> >
> >> If you don’t index content, you won’t be able to search for it, I guess.
> >> That said, Tika can apply a limit on extracted characters. See
> >> indexedChars below:
> >>
> >> tika().parseToString(new BytesStreamInput(content, false), metadata, indexedChars);
> >>
> >> [1] https://github.com/elasticsearch/elasticsearch-mapper-attachments/blob/master/src/main/java/org/elasticsearch/index/mapper/attachment/AttachmentMapper.java#L456
> >>
> >>> On 10 Feb 2015, at 09:24, sreedevi s <sreedevi.payik...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>> Which is the best method to search in attachments in lucene? I am new
> >>> to lucene and I am using version 4.10.2. By making use of Tika, I know I
> >>> can convert files to text and then index it as another field. But for
> >>> large files that will not be the ideal solution. I believe the maximum
> >>> characters per field is 10,000. So what would be the ideal method to
> >>> search attachments, then?
> >>>
> >>> Best Regards,
> >>> Sreedevi S
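
The effect of the character limit discussed in this thread can be illustrated with a minimal plain-Java sketch. It makes no Tika or Lucene calls: the class `IndexedCharsDemo` and the helpers `indexablePrefix` and `isSearchable` are hypothetical names introduced only for illustration, and the substring match is a naive stand-in for a real index lookup. It assumes the semantics David describes, namely that only the first `indexedChars` characters of the extracted text are indexed and that -1 means no limit.

```java
import java.util.Locale;

final class IndexedCharsDemo {

    /**
     * Returns the part of the extracted text that would be handed to the index.
     * A negative limit is treated as "no limit", mirroring the -1 setting
     * mentioned in the thread (assumed semantics, not library code).
     */
    static String indexablePrefix(String extractedText, int indexedChars) {
        if (indexedChars < 0 || indexedChars >= extractedText.length()) {
            return extractedText;
        }
        return extractedText.substring(0, indexedChars);
    }

    /** Naive stand-in for a search: case-insensitive substring match. */
    static boolean isSearchable(String indexedText, String term) {
        return indexedText.toLowerCase(Locale.ROOT)
                .contains(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical document: at least 10,000 filler characters,
        // followed by one distinctive word.
        StringBuilder sb = new StringBuilder();
        while (sb.length() < 10_000) {
            sb.append("filler ");
        }
        sb.append("needle");
        String text = sb.toString();

        // With a 10,000-character limit the word falls outside the indexed part.
        System.out.println(isSearchable(indexablePrefix(text, 10_000), "needle")); // false
        // With -1 (no limit) the whole text is kept, so the word is found.
        System.out.println(isSearchable(indexablePrefix(text, -1), "needle"));     // true
    }
}
```

Anything past the limit is simply absent from what gets indexed, which is why raising the limit (or disabling it) is the only way to make the tail of a large file searchable under this model.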