Re: Encryption At Rest - Using CustomAnalyzer

2018-02-05 Thread Michael Wilkowski
Hi,
sorry to say this, but your encryption is not secure at all; in fact it is
very weak. Since you encrypt tokens only (and apply padding), it is quite
easy, given the examples above, to reverse engineer your text. Anyone who
understands the domain, has the text distribution, and can build a so-called
word2vec model can easily use it to construct a reverse dictionary of your
tokens.

On the other hand, this means it should not be too difficult to build
wildcard queries (at least with the asterisk at the end of the word, not at
the beginning). Check how FuzzyQuery works right now - it is quite easy to
understand and straightforward when you look at the source code. I built my
own version of FuzzyQuery some time ago based on the MultiTermQuery class.

MW




*Michael Wilkowski*
Chief Technology Officer, Silent Eight Pte Ltd

+48 600 995 603 | m...@silenteight.com

www.silenteight.com

On Tue, Feb 6, 2018 at 3:42 AM, aravinth thangasami <
aravinththangas...@gmail.com> wrote:

> Kindly post your suggestions.
>
>
>
> On Mon, Dec 4, 2017 at 11:27 PM, aravinth thangasami <
> aravinththangas...@gmail.com> wrote:
>
> > Hi all,
> >
> > To support encryption at rest, we have written a custom analyzer that
> > encrypts every token in the input string and then proceeds to the
> > default indexing chain.
> >
> > We are using AES/CTR/NoPadding with a unique key per user.
> > This ensures that input strings with a common prefix produce encrypted
> > strings with a common prefix,
> > so that we can also perform prefix queries.
> >
> > For example,
> >
> > run   x5X7
> > runs  x5X7tg==
> > running x5X7q/nE5g==
> >
> >
> > During searching, we preprocess the query for the encrypted field
> > before searching.
> > We can't do wildcard & fuzzy queries.
> >
> >
> > Did anyone try this approach?
> > Please post your suggestions and the approaches you have tried.
> >
> >
> > Thanks
> > Aravinth
> >
> >
> >
> >
>
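
For context, below is a minimal sketch of the kind of prefix-preserving
token encryption described above. It assumes AES/CTR with a fixed, reused
IV - which is exactly what makes the ciphertexts deterministic and
prefix-aligned, and hence open to the reverse-dictionary attack outlined at
the top of this thread. Class and parameter names are illustrative, not
from the original poster's code.

    import java.nio.charset.StandardCharsets;
    import java.util.Base64;
    import javax.crypto.Cipher;
    import javax.crypto.spec.IvParameterSpec;
    import javax.crypto.spec.SecretKeySpec;

    public class TokenEncryptor {
        private final SecretKeySpec key; // 16/24/32-byte AES key, unique per user
        private final IvParameterSpec iv; // fixed IV: deterministic but insecure

        public TokenEncryptor(byte[] keyBytes, byte[] ivBytes) {
            this.key = new SecretKeySpec(keyBytes, "AES");
            this.iv = new IvParameterSpec(ivBytes);
        }

        // Same keystream every time, so tokens sharing a plaintext prefix
        // share a ciphertext prefix (and thus a Base64 prefix), as in the
        // run / runs / running example above.
        public String encryptToken(String token) throws Exception {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, iv);
            byte[] ct = cipher.doFinal(token.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(ct);
        }
    }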


MultiTermQuery vs multiple TermQuery'ies - is there a performance gain?

2017-05-23 Thread Michael Wilkowski
Hi,
I am building an app that creates multiple term queries joined with OR
(more than 100 primitive TermQuery instances).

Is there a real performance gain from implementing a custom MultiTermQuery
instead of simply joining multiple TermQuery clauses with OR?
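
For comparison, a minimal sketch of the plain disjunction approach,
assuming a Lucene 5.x/6.x-era API; field and value names are illustrative.
Note that BooleanQuery caps the clause count (1024 by default), and newer
Lucene versions also offer a dedicated set-of-terms query (TermInSetQuery,
if memory serves) for exactly this many-terms case.

    import java.util.List;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause.Occur;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class OrOverTerms {
        // One SHOULD clause per term; fine for hundreds of terms, but watch
        // the default maximum of 1024 clauses per BooleanQuery.
        static Query buildOr(String field, List<String> values) {
            BooleanQuery.Builder b = new BooleanQuery.Builder();
            for (String v : values) {
                b.add(new TermQuery(new Term(field, v)), Occur.SHOULD);
            }
            return b.build();
        }
    }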

Regards,
MW


Re: Heavy usage of final in Lucene classes

2017-01-12 Thread Michael Wilkowski
Perfect! Thanks, that is what I was looking for :-).

MW
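
For reference, a minimal sketch of the CustomAnalyzer route Alan suggests
below - a hedged example, since it assumes the CustomAnalyzer builder API
of Lucene 5.x+; the gap value and factory names are illustrative:

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.custom.CustomAnalyzer;

    public class GapAnalyzer {
        // A gap of 100 positions between values of a multi-valued field
        // stops a PhraseQuery such as fieldX:"name2 name3" from matching
        // across value boundaries - no EMPTY_VALUE hack needed.
        static Analyzer build() throws Exception {
            return CustomAnalyzer.builder()
                .withTokenizer("standard")
                .addTokenFilter("lowercase")
                .withPositionIncrementGap(100)
                .build();
        }
    }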


On Thu, Jan 12, 2017 at 12:02 PM, Alan Woodward  wrote:

> Hi Michael,
>
> You want to set the positionIncrementGap - either wrap your analyzer with
> an AnalyzerWrapper that overrides getPositionIncrementGap(), or use a
> CustomAnalyzer builder and set it there.
>
> Alan Woodward
> www.flax.co.uk
>
>
> > On 12 Jan 2017, at 10:57, Michael Wilkowski  wrote:
> >
> > Hi,
> > I wanted to subclass StandardTokenizer to manipulate the position
> > attribute a little. I wanted to increase the gap between adjacent values
> > of the same field, so that for a multi-valued TextField:
> >
> > fieldX: "name1 name2",
> > fieldX:"name3 name4"
> >
> > then a PhraseQuery like fieldX:"name2 name3" would not return a result.
> > I was forced to insert "empty" values like this:
> >
> > fieldX: "name1 name2",
> > fieldX: "EMPTY_VALUE",
> > fieldX:"name3 name4"
> >
> > to achieve it.
> >
> > Regards,
> > MW
> >
> >
> >
> > On Thu, Jan 12, 2017 at 1:10 AM, Michael McCandless <
> > luc...@mikemccandless.com> wrote:
> >
> >> I don't think it's about efficiency but rather about not exposing
> >> possibly trappy APIs / usage ...
> >>
> >> Do you have a particular class/method that you'd want to remove final
> from?
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >>
> >> On Wed, Jan 11, 2017 at 4:15 PM, Michael Wilkowski 
> >> wrote:
> >>> Hi,
> >>> I sometimes wonder about the purpose of such heavy use of "final"
> >>> methods and classes in Lucene. It makes my life much harder to override
> >>> standard classes with a custom implementation.
> >>>
> >>> What comes to my mind first is runtime efficiency (the compiler "knows"
> >>> that this class/method will not be overridden and may create more
> >>> efficient code without jump lookup tables and with method inlining). Is
> >>> my assumption correct, or were there other benefits behind this decision?
> >>>
> >>> Regards,
> >>> Michael W.
> >>
>
>


Re: Heavy usage of final in Lucene classes

2017-01-12 Thread Michael Wilkowski
Hi,
I wanted to subclass StandardTokenizer to manipulate the position attribute
a little. I wanted to increase the gap between adjacent values of the same
field, so that for a multi-valued TextField:

fieldX: "name1 name2",
fieldX:"name3 name4"

then a PhraseQuery like fieldX:"name2 name3" would not return a result.
I was forced to insert "empty" values like this:

fieldX: "name1 name2",
fieldX: "EMPTY_VALUE",
fieldX:"name3 name4"

to achieve it.

Regards,
MW



On Thu, Jan 12, 2017 at 1:10 AM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> I don't think it's about efficiency but rather about not exposing
> possibly trappy APIs / usage ...
>
> Do you have a particular class/method that you'd want to remove final from?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Jan 11, 2017 at 4:15 PM, Michael Wilkowski 
> wrote:
> > Hi,
> > I sometimes wonder about the purpose of such heavy use of "final"
> > methods and classes in Lucene. It makes my life much harder to override
> > standard classes with a custom implementation.
> >
> > What comes to my mind first is runtime efficiency (the compiler "knows"
> > that this class/method will not be overridden and may create more
> > efficient code without jump lookup tables and with method inlining). Is
> > my assumption correct, or were there other benefits behind this decision?
> >
> > Regards,
> > Michael W.
>


Heavy usage of final in Lucene classes

2017-01-11 Thread Michael Wilkowski
Hi,
I sometimes wonder about the purpose of such heavy use of "final" methods
and classes in Lucene. It makes my life much harder to override standard
classes with a custom implementation.

What comes to my mind first is runtime efficiency (the compiler "knows"
that this class/method will not be overridden and may create more efficient
code without jump lookup tables and with method inlining). Is my assumption
correct, or were there other benefits behind this decision?

Regards,
Michael W.


Re: Lucene performance benchmark | search throughput

2017-01-03 Thread Michael Wilkowski
My guess: more conditions = fewer documents to score, sort, and return.

On Mon, Jan 2, 2017 at 7:23 PM, Rajnish kamboj 
wrote:

> Hi
>
> Is there any Lucene performance benchmark against a certain set of data?
> [i.e. are there any stats for the search throughput Lucene can provide for
> a given data set?]
>
> Search throughput Example:
> Max. 200 TPS for 50K data on Lucene 5.3.1 on RHEL version x (with SSD)
> Max. 150 TPS for 100K data on Lucene 5.3.1 on RHEL version x (with SSD)
> Max. 300 TPS for 50K data on Lucene 6.0.0 on RHEL version x (with SSD)
> etc.
>
> Also, does the index size matter for search throughput?
>
> Our observation:
> When we increase the data size (and hence the index size), the search
> throughput decreases.
> When we add more AND conditions, the search throughput increases. Why?
> Intuitively, if we add more conditions then Lucene should have more work
> to do (including merging) and the throughput should decrease, yet it
> increases.
>
>
> Regards
> Rajnish
>


FuzzyQuery on entire set of terms

2016-10-21 Thread Michael Wilkowski
Hi,
I need to implement a function that performs a fuzzy search on multiple
terms such that a total edit distance of 2, summed across ALL terms, is
allowed. For example, the query:

Lucene Apache Group

with maximum distance 2 should match:

Luceni Apachi Group
Lucen Apache Group
Luce Apache Group

but not:

Lucen Apach Grou

I know that I can achieve this using multiple FuzzyQueries nested within
BooleanQueries, but with more terms (>5) and a distance of 2 there could be
very many combinations, and I am worried about performance.
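
To make the combination blowup concrete, here is a minimal sketch of the
brute-force construction with a budget of 2 - a hedged illustration, not a
recommendation: for n terms it already produces n + n*(n-1)/2 conjunctions.
Field and class names are illustrative.

    import java.util.List;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause.Occur;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.FuzzyQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class BudgetedFuzzy {
        // Disjunction over every way to spend an edit budget of 2:
        // one term takes up to 2 edits, or two distinct terms take 1 each.
        static Query build(String field, List<String> terms) {
            BooleanQuery.Builder outer = new BooleanQuery.Builder();
            int n = terms.size();
            for (int i = 0; i < n; i++) {
                outer.add(conjunction(field, terms, i, -1), Occur.SHOULD);
            }
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    outer.add(conjunction(field, terms, i, j), Occur.SHOULD);
                }
            }
            return outer.build();
        }

        // i gets 2 edits when j < 0, else i and j get 1 each; others exact.
        private static Query conjunction(String field, List<String> terms,
                                         int i, int j) {
            BooleanQuery.Builder inner = new BooleanQuery.Builder();
            for (int k = 0; k < terms.size(); k++) {
                Term t = new Term(field, terms.get(k));
                int edits = (k == i) ? (j < 0 ? 2 : 1) : (k == j ? 1 : 0);
                inner.add(edits == 0 ? new TermQuery(t) : new FuzzyQuery(t, edits),
                          Occur.MUST);
            }
            return inner.build();
        }
    }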

Perhaps there is a better solution that someone may recommend?

Regards,
Michael


Re: Handling multiple locale

2016-09-26 Thread Michael Wilkowski
Hi,
in my opinion your system locales have nothing to do with the analyzers
you want to apply. I would not rely on system locales, as that makes the
application very unportable.

As for other ways - there are none. You could apply a regex query or create
custom queries, but you cannot refer to fields dynamically; you may just as
easily do it by hand (brute force, as you said - you are only limited by
the maximum number of clauses in a BooleanQuery).

MW

On Mon, Sep 26, 2016 at 4:10 AM, lukes  wrote:

> 1 more question :). Are numbers analyzed ? Like IntField, LongField, etc. ?
>
> Regards.
>
>
>
>
>


Re: Handling multiple locale

2016-09-25 Thread Michael Wilkowski
Hi,
let me check that I understand correctly: do you want to run your query
across all possible locales? If so, my personal pattern in such a case
would be to create multiple BooleanClauses (with Occur.SHOULD, one clause
per locale) and add them to one BooleanQuery.
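
A minimal sketch of that pattern, assuming one field per locale (the
field-naming convention here, e.g. "title_en", is purely illustrative):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause.Occur;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class LocaleSearch {
        // One SHOULD clause per locale-specific field; in practice the
        // query text should be analyzed with each locale's own analyzer.
        static Query acrossLocales(String base, String[] locales, String token) {
            BooleanQuery.Builder b = new BooleanQuery.Builder();
            for (String locale : locales) {
                b.add(new TermQuery(new Term(base + "_" + locale, token)),
                      Occur.SHOULD);
            }
            return b.build();
        }
    }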

MW


On Sun, Sep 25, 2016 at 8:51 PM, lukes  wrote:

> Hi all,
>
>   Any suggestions from the experts ?  I assume, this problem is not coming
> for the first time.
>
> Regards.
>
>
>
>
>
>


Re: Using Lucene to model ownership of documents

2016-06-16 Thread Michael Wilkowski
Definitely b). I would also suggest using groups, and expanding a user's
groups at sign-in time.
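
A minimal sketch of option b) with a filter clause at query time - hedged,
assuming a Lucene 5.x+ API where a BooleanQuery FILTER clause restricts
matches without affecting scoring; field and method names are illustrative:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause.Occur;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class OwnershipFilter {
        // Index time: one token per owning customer (or group).
        static void addOwners(Document doc, String ownerTokens) {
            doc.add(new Field("customers", ownerTokens, TextField.TYPE_NOT_STORED));
        }

        // Query time: the FILTER clause restricts results to owned documents
        // without contributing to the score.
        static Query restrictTo(Query userQuery, String customerId) {
            return new BooleanQuery.Builder()
                .add(userQuery, Occur.MUST)
                .add(new TermQuery(new Term("customers", customerId)), Occur.FILTER)
                .build();
        }
    }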

MW

On Thu, Jun 16, 2016 at 12:36 PM, Ian Lea  wrote:

> I'd definitely go for b).  The index will of course be larger for every
> extra bit of data you store but it doesn't sound like this would make much
> difference.  Likewise for speed of indexing.
>
>
> --
> Ian.
>
>
> On Wed, Jun 15, 2016 at 2:25 PM, Geebee Coder  wrote:
>
> > Hi there,
> > I would like to use Lucene to solve the following problem:
> >
> > 1.We have about 100k customers and we have 25 millions of documents.
> >
> > 2.When a customer performs a text search on the document space, we want
> to
> > return only documents that the customer has access to.
> >
> > 3.The # of documents a customer owns varies a lot. some have close to 23
> > million, some have close to 10k and some own a third of the documents
> etc.
> >
> > What is an efficient way to use Lucene in this scenario in terms of
> > performance and indexing?
> > We have tried a number of solutions such as
> >
> >  a)100k boolean fields per document that indicates whether a customer has
> > access to the document.
> >  b)A single text field that has a list of customers who owns the document
> > e.g. (customers field : "abc abd cfx...")
> > c) the above option with shards by customers
> >
> > The search & index performance for a) was bad. b) and c) performed better
> > for search but lengthened indexing time and increased the index size.
> > We are also thinking about using a custom filter but we are concerned
> about
> > the memory requirements.
> >
> > Any ideas/suggestions would be really appreciated.
> >
>


Re: Cache Lucene based index.

2016-05-21 Thread Michael Wilkowski
I recommend reading

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html?m=1

Just use MMapDirectory and let the operating system do the rest.
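
A minimal sketch, assuming a Lucene 5.x+ API (the index path is
illustrative):

    import java.nio.file.Paths;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.MMapDirectory;

    public class MmapExample {
        // Open once and reuse; the OS page cache keeps the hot parts of the
        // index in RAM, which is the "caching" asked about here.
        static IndexSearcher open(String path) throws Exception {
            MMapDirectory dir = new MMapDirectory(Paths.get(path));
            return new IndexSearcher(DirectoryReader.open(dir));
        }
    }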

MW
Sent from Mi phone
On 21 May 2016 12:42, "Prateek Singhal"  wrote:

> You can consider that I want to store the lucene index in some sort of
> temporary memory or a HashMap so that I do not need to index the documents
> every time as it is a costly operation. I can directly return the lucene
> index from that HashMap and use it to answer my queries.
>
> Just want to know if I can access the lucene index object which lucene has
> created so that I can cache it.
>
>
>
> On Sat, May 21, 2016 at 3:46 PM, Uwe Schindler  wrote:
>
> > Hi,
> >
> > What do you mean with "cache"?
> >
> > Uwe
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> > > -Original Message-
> > > From: Prateek Singhal [mailto:prateek.b...@gmail.com]
> > > Sent: Saturday, May 21, 2016 11:27 AM
> > > To: java-user@lucene.apache.org
> > > Subject: Cache Lucene based index.
> > >
> > > Hi Lucene lovers,
> > >
> > > I have a use-case where I want to *create a lucene based index* of
> > multiple
> > > documents and then *want to cache that index*.
> > >
> > > Can anyone suggest if this is possible ?
> > > And which *type of cache* will be most efficient for this use case.
> > >
> > > Also if you can provide me with any *example *of the same then it will
> be
> > > really very helpful.
> > >
> > > Thanks.
> >
> >
> >
> >
>
>
> --
> Regards,
> Prateek Singhal
> Software Development Engineer @ Amazon.com
>
> "Believe in yourself and you can do unbelievable things."
>


Re: TermRangeQuery work not

2015-12-26 Thread Michael Wilkowski
You mixed up lowerDate and upperDate.
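
For reference, a minimal sketch of a correctly ordered range - the field
name and values are illustrative; a reversed range simply matches nothing:

    import org.apache.lucene.search.TermRangeQuery;

    public class RangeExample {
        static TermRangeQuery dateRange() {
            // Lower bound first, upper bound second, both inclusive here.
            return TermRangeQuery.newStringRange("date", "20150101", "20151231",
                                                 true, true);
        }
    }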

MW
Sent from Mi phone
On 25 Dec 2015 16:41, "kaog"  wrote:

> hi
> I changed the "ISBN" variable; that was a mistake I made when writing the
> post. Unfortunately TermRangeQuery still does not work. :(
>
>
>
>
>


Re: Wildcard Terms and total word or phrase count

2015-11-29 Thread Michael Wilkowski
Hi Doug,
your attachment did not come through (likely stripped by security
settings). Please put it on GitHub or somewhere else and provide a link to
download it.

MW

On Mon, Nov 30, 2015 at 2:29 AM, Kunzman, Douglas * <
douglas.kunz...@fda.hhs.gov> wrote:

>
> Jack -
>
> Thanks a lot for taking the time to try and answer my question.
>
> From using Solr I knew that it needed to be a TextField.
>
> I'm including the entire unit tester as an attachment.
>
> Thanks,
> Doug
>
> -Original Message-
> From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> Sent: Sunday, November 29, 2015 12:18 PM
> To: java-user@lucene.apache.org
> Subject: Re: Wildcard Terms and total word or phrase count
>
> You didn't post your code that creates the index. Make sure you are using a
> tokenized TextField rather than a single-token StringField.
>
> -- Jack Krupansky
>
> On Fri, Nov 27, 2015 at 4:06 PM, Kunzman, Douglas * <
> douglas.kunz...@fda.hhs.gov> wrote:
>
> > Hi -
> >
> > This is my first Lucene project, my other search projects have used Solr.
> > I would like to find the total number of WildCard terms in a set of
> > documents with 0-N matches per document.
> > I would prefer not to have to open each document where a match is found.
> > I need to be able to support wildcards, but my requirements are somewhat
> > flexible regarding phrase search support.
> > Whatever is easier.
> >
> > This is what I have so far.
> >
> >public static void main(String args[]) throws IOException,
> > ParseException {
> > Directory idx = FSDirectory.open(path);
> > index("C:\\Users\\Douglas.Kunzman\\Desktop\\test_index");
> >
> > Term term = new Term("Doc", "quar*");
> >
> > WildcardQuery wc = new WildcardQuery(term);
> >
> > SpanQuery spanTerm = new
> > SpanMultiTermQueryWrapper<WildcardQuery>(wc);
> > IndexReader indexReader = DirectoryReader.open(idx);
> >
> > System.out.println("Term freq=" +
> indexReader.totalTermFreq(term));
> > System.out.println("Term freq=" +
> > indexReader.getSumTotalTermFreq("Doc"));
> >
> > IndexSearcher isearcher = new IndexSearcher(indexReader);
> >
> > IndexReaderContext indexReaderContext =
> > isearcher.getTopReaderContext();
> > TermContext context = TermContext.build(indexReaderContext,
> term);
> > TermStatistics termStatistics = isearcher.termStatistics(term,
> > context);
> > System.out.println("termStatics=" +
> > termStatistics.totalTermFreq());
> > }
> >
> > Does anyone have any suggestions? totalTermFreq is zero, but when we
> > search using "quartz" we find matches.
> > I'm searching the Quartz user's guide as an example.
> >
> > Thanks,
> > Doug
> >
> >
> >
> >
> >
> >
>
>
>


Re: Wildcard Terms and total word or phrase count

2015-11-29 Thread Michael Wilkowski
It is because your index does not contain the term "quar*", and the
statistics functions are not queries (you have to pass the exact form of a
term). To count the documents that meet your search criteria, you can run a
search query with a custom collector and count the results, or run a normal
search returning TopDocs and just check the total hit count (the first
option is faster because no results are gathered and sorted).
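
A minimal sketch of the collector option, assuming Lucene's
TotalHitCountCollector (5.x-era API); note it counts matching documents,
not total term occurrences. The field and pattern are taken from the
snippet below.

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TotalHitCountCollector;
    import org.apache.lucene.search.WildcardQuery;

    public class WildcardCount {
        // Counts matching documents without gathering or sorting any hits.
        static int countMatches(IndexSearcher searcher) throws Exception {
            WildcardQuery q = new WildcardQuery(new Term("Doc", "quar*"));
            TotalHitCountCollector collector = new TotalHitCountCollector();
            searcher.search(q, collector);
            return collector.getTotalHits();
        }
    }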

MW
Sent from Mi phone
On 27 Nov 2015 22:06, "Kunzman, Douglas *" 
wrote:

> Hi -
>
> This is my first Lucene project, my other search projects have used Solr.
> I would like to find the total number of WildCard terms in a set of
> documents with 0-N matches per document.
> I would prefer not to have to open each document where a match is found.
> I need to be able to support wildcards, but my requirements are somewhat
> flexible regarding phrase search support.
> Whatever is easier.
>
> This is what I have so far.
>
>public static void main(String args[]) throws IOException,
> ParseException {
> Directory idx = FSDirectory.open(path);
> index("C:\\Users\\Douglas.Kunzman\\Desktop\\test_index");
>
> Term term = new Term("Doc", "quar*");
>
> WildcardQuery wc = new WildcardQuery(term);
>
> SpanQuery spanTerm = new
> SpanMultiTermQueryWrapper<WildcardQuery>(wc);
> IndexReader indexReader = DirectoryReader.open(idx);
>
> System.out.println("Term freq=" + indexReader.totalTermFreq(term));
> System.out.println("Term freq=" +
> indexReader.getSumTotalTermFreq("Doc"));
>
> IndexSearcher isearcher = new IndexSearcher(indexReader);
>
> IndexReaderContext indexReaderContext =
> isearcher.getTopReaderContext();
> TermContext context = TermContext.build(indexReaderContext, term);
> TermStatistics termStatistics = isearcher.termStatistics(term,
> context);
> System.out.println("termStatics=" +
> termStatistics.totalTermFreq());
> }
>
> Does anyone have any suggestions? totalTermFreq is zero, but when we
> search using "quartz" we find matches.
> I'm searching the Quartz user's guide as an example.
>
> Thanks,
> Doug
>
>
>
>
>
>


Re: Determine whether a MatchAllQuery or a Query with atleast one Term

2015-11-27 Thread Michael Wilkowski
Instanceof?
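
That is, a minimal sketch - hedged: it deliberately avoids extractTerms(),
so NumericRangeQuery and friends pose no problem, but it does not unwrap
boolean or boost wrappers:

    import org.apache.lucene.search.MatchAllDocsQuery;
    import org.apache.lucene.search.Query;

    public class QueryKind {
        // Anything that is not a MatchAllDocsQuery is treated as having
        // at least one criterion, term-based or not.
        static boolean isMatchAll(Query q) {
            return q instanceof MatchAllDocsQuery;
        }
    }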

MW
Sent from Mi phone
On 28 Nov 2015 06:57, "Sandeep Khanzode" 
wrote:

> Hi,
> I have a question.
> In my program, I need to check whether the input query is a MatchAll Query
> that contains no terms, or a Query (any variant) that has at least one
> term. For typical Term queries, this seems reasonable to be done with
> Query.extractTerms(Set<> terms) which gives the list of terms.
> However, when there is a NumericRangeQuery, this method throws an
> UnsupportedOperationException.
> How can I determine that a NumericRangeQuery or any non-Term query exists
> in the Input Query and differentiate it from the MatchAllQuery? -- SRK


Re: Lucene auto suggest

2015-11-25 Thread Michael Wilkowski
Try some examples from stackoverflow:
http://stackoverflow.com/questions/24968697/how-to-implements-auto-suggest-using-lucenes-new-analyzinginfixsuggester-api
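
Along the lines of those examples, a minimal self-contained sketch using
AnalyzingInfixSuggester - hedged, as it assumes a Lucene 5.x-era API; the
index path and sample rows are illustrative (in practice you would stream
the rows from the database):

    import java.io.StringReader;
    import java.nio.file.Paths;
    import java.util.List;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.search.spell.PlainTextDictionary;
    import org.apache.lucene.search.suggest.Lookup;
    import org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester;
    import org.apache.lucene.store.FSDirectory;

    public class SuggestExample {
        public static void main(String[] args) throws Exception {
            AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester(
                FSDirectory.open(Paths.get("suggest-index")),
                new StandardAnalyzer());
            // One suggestion entry per line.
            suggester.build(new PlainTextDictionary(new StringReader(
                "Fenway Antenna Dipole Top CH00 with Coaxial Cable Length 140mm\n"
              + "Fenway Antenna Dipole Side CH01 with Coaxial Cable Length 220mm\n"
              + "ANVIL DIPOLE ANT0\n")));
            // allTermsRequired=true narrows results as the user types more words.
            List<Lookup.LookupResult> hits =
                suggester.lookup("Fenway Antenna Dipole", 10, true, false);
            for (Lookup.LookupResult hit : hits) {
                System.out.println(hit.key);
            }
            suggester.close();
        }
    }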

On Wed, Nov 25, 2015 at 4:18 AM, Bhaskar  wrote:

> Could you please give some help here?
>
> On Mon, Nov 23, 2015 at 10:50 PM, Bhaskar  wrote:
>
> > Hi,
> > I have one column in the database and it has the data below (it can have
> > 5000 to 3 rows)
> >
> > Fenway Antenna Dipole Top CH00 with Coaxial Cable Length 140mm
> > Fenway Antenna Dipole Side CH01 with Coaxial Cable Length 220mm
> > Fenway Antenna Slot Front CH02 with Coaxial Cable Length 220mm
> > ANTENNA,C1318-510009-A,GP,SE0810
> > ANTENNA,C1318-510010-A,GP,SE0810
> > ANTENNA,C1318-510011-A,GP,SE0810
> > ANTENNA,MAF94108,GP,SN0905A,WLAN
> > ANTENNA,MAF94119,GP,SN0905A,WLAN
> > ANTENNA,MAF94362,GP,SN0905A,WLAN
> > ANTENNA,MAF94159,GP,SN0906A,WLAN
> > ANTENNA,MAF94195,GP,SN0906A,WLAN
> > ANTENNA,MAF94196,GP,SN0906A,WLAN
> > ANTENNA, STAMPED METAL, BOOST, CHAIN-0
> > ANTENNA, STAMPED METAL, BOOST, CHAIN-2
> > ANVIL DIPOLE ANT0
> > ANVIL DIPOLE ANT1
> > LIMELIGHT ANTENNA-A CABLE
> > LIMELIGHT ANTENNA-B CABLE
> > LIMELIGHT ANTENNA-D CABLE
> >
> >
> > I think I want fuzzy suggestions only... for example,
> > when the user types *Fenway* then the entries starting with *Fenway*
> > should come up, i.e.
> >
> > Fenway Antenna Dipole Top CH00 with Coaxial Cable Length 140mm
> > Fenway Antenna Dipole Side CH01 with Coaxial Cable Length 220mm
> > Fenway Antenna Slot Front CH02 with Coaxial Cable Length 220mm
> >
> > Based on the user input the result should change. If the user typed
> > *Fenway Antenna Dipole* then
> >
> > Fenway Antenna Dipole Top CH00 with Coaxial Cable Length 140mm
> > Fenway Antenna Dipole Side CH01 with Coaxial Cable Length 220mm
> >
> > So, based on the data typed, the result should change.
> >
> >
> > Could you please suggest the best way to achieve this (maybe with some
> > samples)?
> > Please let me know if I missed any info you need.
> >
> > Thank you very much.
> >
> > Regards,
> > Bhaskar
> >
> >
> >
> > On Mon, Nov 23, 2015 at 10:24 PM, Alessandro Benedetti <
> > abenede...@apache.org> wrote:
> >
> >> Can you list us your requirements ?
> >>
> >> Is analysis needed in the suggester ?
> >> Do you want infix suggestions ?
> >> Do you want fuzzy suggestions ?
> >> Suggestions of the whole content of a field or only few tokens ?
> >>
> >> Starting from that you can take a look to the suggester component and
> all
> >> the different implementations.
> >> There are a lot of Lookup strategy, very specific depending on the use
> >> case.
> >>
> >> Cheers
> >>
> >> On 23 November 2015 at 12:39, Bhaskar  wrote:
> >>
> >> > Hi,
> >> >
> >> > I am new Lucene Auto suggest.
> >> > Could you please some share the lucene auto suggest sample
> >> > applications/code..
> >> >
> >> > My use case is:
> >> > I have the data in the database. I would like write some auto suggest
> on
> >> > the data base data.
> >> >
> >> > i.e. we have some text box in UI. when user is trying to enter some
> >> thing
> >> > we have to auto suggest based on the user input.
> >> >
> >> > Thanks in advance for help.
> >> >
> >> > --
> >> > Keep Smiling
> >> > Thanks & Regards
> >> > Bhaskar.
> >> > Mobile:9866724142
> >> >
> >>
> >>
> >>
> >> --
> >> --
> >>
> >> Benedetti Alessandro
> >> Visiting card : http://about.me/alessandro_benedetti
> >>
> >> "Tyger, tyger burning bright
> >> In the forests of the night,
> >> What immortal hand or eye
> >> Could frame thy fearful symmetry?"
> >>
> >> William Blake - Songs of Experience -1794 England
> >>
> >
> >
> >
> > --
> > Keep Smiling
> > Thanks & Regards
> > Bhaskar.
> > Mobile:9866724142
> >
>
>
>
> --
> Keep Smiling
> Thanks & Regards
> Bhaskar.
> Mobile:9866724142
>


Re: does field cache support multivalue?

2015-11-19 Thread Michael Wilkowski
Yes - according to the Lucene in Action book, you cannot use the field
cache in such situations.
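
A hedged aside: the doc-values API does support multiple values per
document per field, if that fits the use case. A minimal sketch against the
Lucene 5.x/6.x-era SortedSetDocValues API (field names illustrative;
Lucene 7+ changed the iteration to advanceExact()):

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.SortedSetDocValuesField;
    import org.apache.lucene.index.DocValues;
    import org.apache.lucene.index.LeafReader;
    import org.apache.lucene.index.SortedSetDocValues;
    import org.apache.lucene.util.BytesRef;

    public class MultiValuedDocValues {
        // Index time: several doc-values entries for one field on one doc.
        static void index(Document doc) {
            doc.add(new SortedSetDocValuesField("tags", new BytesRef("red")));
            doc.add(new SortedSetDocValuesField("tags", new BytesRef("blue")));
        }

        // Read back every value for one document in a segment.
        static void readAll(LeafReader leaf, int docId) throws Exception {
            SortedSetDocValues dv = DocValues.getSortedSet(leaf, "tags");
            dv.setDocument(docId);
            for (long ord = dv.nextOrd();
                 ord != SortedSetDocValues.NO_MORE_ORDS;
                 ord = dv.nextOrd()) {
                System.out.println(dv.lookupOrd(ord).utf8ToString());
            }
        }
    }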

MW


On Fri, Nov 20, 2015 at 8:41 AM, Yonghui Zhao  wrote:

> If I index one field more than once, it seems I can't get all the values
> from the Lucene field cache?
>
> right?
>


Re: one large index vs many small indexes

2015-11-11 Thread Michael Wilkowski
Hi,
many small indexes seem more reasonable, and much more efficient, than one
large index shared by all customers.

I recommend the very good book Lucene in Action - just reading the first
few chapters (indexing & searching) will give you a good idea of Lucene's
internals and index structure, and of why separate indexes will be much
more efficient.
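
A minimal sketch of the per-customer layout - purely illustrative (the base
path and lazy-open strategy are assumptions, not a prescription):

    import java.nio.file.Paths;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class PerCustomerIndexes {
        private final Map<String, Directory> dirs = new ConcurrentHashMap<>();

        // One self-contained index per customer: smaller indexes, independent
        // lifecycles, and no per-customer filtering at query time.
        Directory forCustomer(String customerId) {
            return dirs.computeIfAbsent(customerId, id -> {
                try {
                    return FSDirectory.open(Paths.get("indexes", id));
                } catch (java.io.IOException e) {
                    throw new RuntimeException(e);
                }
            });
        }
    }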

Regards,
Michael Wilkowski


On Wed, Nov 11, 2015 at 9:40 AM, Sascha Janz  wrote:

> hello,
>
> we must make a design decision for our system. We have many customers which
> all should use the same server. Now we are thinking about whether to make a
> separate Lucene index for each customer, or to make one large index and use
> a filter for each customer.
>
> Any suggestions, comments or experiences on that?
>
> greetings
> Sascha
>
>
>