What are other options for index backup/replication?

2018-04-13 Thread sandesh.yapuram
Hello, I have an application that uses Lucene indexes, and we generate a single index for all data. I want to take a weekly *local* backup of the index quietly and asynchronously. So far I've tried: 1. The replicator module - I found it very complicated, since I just want to copy and paste indexes
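
The "copy and paste" style of local backup described above could look roughly like the following. This is a minimal sketch in plain Java, not the poster's code and not the replicator module: the class name `IndexBackup`, the method `backup`, and the `backup-<date>` folder naming are all hypothetical. It assumes the index lives in a flat directory on disk and that no writer changes the index while the copy runs (for a live index, Lucene's `SnapshotDeletionPolicy` is the usual way to pin a commit first).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.LocalDate;

public class IndexBackup {

    /**
     * Copies every regular file from indexDir into backupRoot/backup-<date>
     * and returns the directory that was written.
     * Assumption: nothing modifies indexDir while this method runs.
     */
    public static Path backup(Path indexDir, Path backupRoot, LocalDate date) throws IOException {
        // LocalDate.toString() yields ISO format, e.g. backup-2018-04-13
        Path target = backupRoot.resolve("backup-" + date);
        Files.createDirectories(target);

        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                // Skip sub-directories and the writer's lock file; the lock is not index data.
                if (!Files.isRegularFile(file) || name.equals("write.lock")) {
                    continue;
                }
                Files.copy(file, target.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return target;
    }
}
```

To run it weekly and off the main thread, the call could be handed to a `java.util.concurrent.ScheduledExecutorService`; that scheduling choice is likewise an assumption, not something stated in the thread.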

Use multiple Analyzers on same field

2017-07-29 Thread sandesh.yapuram
Hello, I'm using Apache Lucene 6.3.0 and I want to index my field 'file_name' using 2 Analyzers: 1. StandardAnalyzer (to allow search using terms) 2. KeywordAnalyzer (to preserve the original name as well, in case the user searches for the entire name) Please note that this can be ac

Preserve the original input string/token while making a CustomAnalyzer

2017-07-28 Thread sandesh.yapuram
Hello, I'm using Apache Lucene 6.3.0 and I'm trying to implement a custom analyzer for my index which allows searching on filenames. The problem is that I also want to allow the user to search using the exact filename, but the Analyzer only has individual tokens and not the original filename as one of
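
The goal described above, individual tokens plus the untouched filename as one more searchable term, can be illustrated without any Lucene classes. The sketch below is plain Java and purely illustrative: `FileNameTokens` and `tokensWithOriginal` are hypothetical names, and lowercasing and splitting on non-alphanumeric ASCII characters are assumptions about the desired behavior, not something the post specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FileNameTokens {

    /**
     * Returns the lowercased full name first, followed by its parts.
     * Parts are produced by splitting on runs of characters that are not
     * ASCII letters or digits (Java's \p{Alnum} is ASCII-only by default).
     */
    public static List<String> tokensWithOriginal(String fileName) {
        List<String> out = new ArrayList<>();
        String lower = fileName.toLowerCase(Locale.ROOT);
        out.add(lower); // the preserved original, usable for exact-name matches

        for (String part : lower.split("[^\\p{Alnum}]+")) {
            // split() can yield an empty first element when the name starts with a separator;
            // a name without separators would otherwise be added twice.
            if (!part.isEmpty() && !part.equals(lower)) {
                out.add(part);
            }
        }
        return out;
    }
}
```

For a made-up name such as `report_final.pdf`, this returns `[report_final.pdf, report, final, pdf]`. In Lucene itself the same effect is commonly obtained by indexing the value into two fields with different analyzers, which is a design choice this sketch does not attempt to reproduce.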

RE: Add more stop characters to StandardAnalyzer

2017-07-28 Thread sandesh.yapuram
Hi Uwe, Uwe Schindler wrote: > If you have specific requirements, you may use PatternTokenizer or > CharTokenizer.fromSeparatorCharPredicate() as your tokenizer. To make an > Analyzer out of it, use CustomAnalyzer. You have full flexibility! Thank you for the quick response, I went through Patter

Add more stop characters to StandardAnalyzer

2017-07-28 Thread sandesh.yapuram
Hello, I am using Lucene 6.3.0 and I am trying to index file names and allow search on them. I'm facing a problem because StandardAnalyzer isn't giving me the tokens I was expecting. Input: mkt-4-elltvs-101_electrical_load_list.pdf Expected output: mkt 4 e
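
The kind of splitting asked for above can be sketched as a separator-predicate tokenizer, the same idea the reply in this thread points to with `CharTokenizer.fromSeparatorCharPredicate()`. The code below is plain Java and does not use Lucene: `SeparatorTokenizer` and `tokenize` are hypothetical names. Because the expected output in the post is cut off, treating every non-letter, non-digit character as a separator is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class SeparatorTokenizer {

    /**
     * Splits input into tokens, starting a new token whenever
     * isSeparator accepts the current code point. Separators are dropped
     * and consecutive separators produce no empty tokens.
     */
    public static List<String> tokenize(String input, IntPredicate isSeparator) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp); // advance by one full code point

            if (isSeparator.test(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // flush the trailing token
        }
        return tokens;
    }
}
```

Called as `SeparatorTokenizer.tokenize("mkt-4-elltvs-101_electrical_load_list.pdf", c -> !Character.isLetterOrDigit(c))`, this sketch yields `[mkt, 4, elltvs, 101, electrical, load, list, pdf]`. Passing the predicate in keeps the separator set adjustable, for example to keep `.` inside tokens.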

Get size occupied by each field in lucene index

2017-07-26 Thread sandesh.yapuram
Hello, I'm using Lucene 6.3.0. I have an index which has 500k documents, with each document having 53 fields. The problem is that the index size is becoming an issue day by day, so we are planning to weed out or trim some fields. I'm trying to get an estimated size of each field using Luke, but the tool only sh