Re: How to merge several Taxonomy indexes

2015-03-23 Thread Gimantha Bandara
Hi Christoph,

My mistake. :) It does exactly what I need; I figured it out later.
Thanks a lot!

On Tue, Mar 24, 2015 at 3:14 AM, Gimantha Bandara  wrote:

> Hi Christoph,
>
> I think TaxonomyMergeUtils is meant to merge a taxonomy directory and an
> index together (correct me if I am wrong). Can it be used to merge several
> taxonomy directories together and create one taxonomy index?
>
> On Mon, Mar 23, 2015 at 9:19 PM, Christoph Kaser 
> wrote:
>
>> Hi Gimantha,
>>
>> have a look at the class org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils,
>> which does exactly what you need.
>>
>> Best regards,
>> Christoph
>>
>> On 23.03.2015 at 15:44, Gimantha Bandara wrote:
>>
>>> Hi all,
>>>
>>> Can anyone point me to a way to merge several taxonomy indexes? My
>>> requirement is as follows. I have several taxonomy indexes and normal
>>> document indexes. I want to merge the taxonomy indexes together, merge the
>>> document indexes together, and search across the result. The document part
>>> I have figured out, and it is easy: all I have to do is create a
>>> MultiReader and pass it to IndexSearcher. But I am stuck on merging the
>>> taxonomy indexes. Is there a way to merge taxonomy indexes?
>>>
>>>
>>
>> --
>> Dipl.-Inf. Christoph Kaser
>>
>> IconParc GmbH
>> Sophienstrasse 1
>> 80333 München
>>
>> www.iconparc.de
>>
>> Tel +49 -89- 15 90 06 - 21
>> Fax +49 -89- 15 90 06 - 49
>>
>> Geschäftsleitung: Dipl.-Ing. Roland Brückner, Dipl.-Inf. Sven Angerer. HRB
>> 121830, Amtsgericht München
>>
>>
>>
>> -
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>
>
> --
> Gimantha Bandara
> Software Engineer
> WSO2. Inc : http://wso2.com
> Mobile : +94714961919
>



-- 
Gimantha Bandara
Software Engineer
WSO2. Inc : http://wso2.com
Mobile : +94714961919


RE: CachingTokenFilter tests fail when using MockTokenizer

2015-03-23 Thread Uwe Schindler
Hi,

One of the problems is that CachingTokenFilter is not 100% conformant to the
TokenStream/TokenFilter specs. It is mainly used internally in Lucene for things
like the highlighter, which needs to consume the same TokenStream multiple times.
But when doing this, the calling code knows how to handle that. One problem is
that reset() is wrongly defined: the rewind case should instead be named
rewind(), so that it behaves correctly and cannot be confused with reset()
[which is called automatically before consumption and has side effects]. To me
CachingTokenFilter is a bug by itself™. This filter is excluded from our random
tests because of those problems (it never gets tested by TestRandomChains).

The problem in your code is that you wrap the underlying TokenStream with
CachingTokenFilter inside incrementToken() and consume it there, and this
confuses the whole TS state machine. You should wrap the input with
CachingTokenFilter in the constructor, not as late as incrementToken() [at that
point reset() has already been called on the underlying stream, so
CachingTokenFilter will call it a second time]. This leads to the problem you
see, which may later cause the end() problem.

In addition, the TokenFilter does not implement reset() correctly, so the whole 
thing cannot be reused in analyzers.
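
Editorial illustration of the call order discussed above (reset(), then
incrementToken() until it reports exhaustion, then end()). This is a toy model
in plain Java, not Lucene code: CheckedStream, its State enum and the
"double reset()" message are hypothetical names invented for this sketch, and
the toy signals exhaustion by returning null where the real API returns false.
Only the end() error text is taken from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy model only: these classes are hypothetical and are not Lucene's TokenStream API. */
class TokenStreamContractSketch {

    /** A token source that throws when the consumer breaks the expected call order. */
    static final class CheckedStream {
        private enum State { CREATED, RESET, EXHAUSTED, ENDED }

        private final List<String> tokens;
        private Iterator<String> cursor;
        private State state = State.CREATED;

        CheckedStream(String... tokens) {
            this.tokens = Arrays.asList(tokens);
        }

        /** Must be called once before each round of consumption. */
        void reset() {
            if (state == State.RESET) {
                // Assumption of this toy: a second reset() mid-consumption is an error.
                throw new IllegalStateException("double reset()");
            }
            cursor = tokens.iterator();
            state = State.RESET;
        }

        /** Returns the next token, or null once the stream is exhausted. */
        String incrementToken() {
            if (state == State.EXHAUSTED) {
                return null;
            }
            if (state != State.RESET) {
                throw new IllegalStateException("incrementToken() called before reset()");
            }
            if (cursor.hasNext()) {
                return cursor.next();
            }
            state = State.EXHAUSTED;
            return null;
        }

        /** Only legal after incrementToken() has reported exhaustion. */
        void end() {
            if (state != State.EXHAUSTED) {
                throw new IllegalStateException(
                        "end() called before incrementToken() returned false!");
            }
            state = State.ENDED;
        }
    }

    /** The expected consumer workflow: reset, drain, end. */
    static List<String> consume(CheckedStream stream) {
        List<String> out = new ArrayList<>();
        stream.reset();
        for (String t = stream.incrementToken(); t != null; t = stream.incrementToken()) {
            out.add(t);
        }
        stream.end();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(consume(new CheckedStream("quick", "brown", "fox")));

        CheckedStream broken = new CheckedStream("quick", "brown");
        broken.reset();
        broken.incrementToken(); // one token consumed, stream not yet exhausted
        try {
            broken.end();        // too early: the stream never reported exhaustion
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second half of main shows how a consumer that stops early and calls end()
trips the same kind of check that the failing test reports. A wrapper that
resets or drains its input at an unexpected point in this sequence would upset
the state machine in a similar way.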

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Spyros Kapnissis [mailto:ska...@yahoo.com.INVALID]
> Sent: Monday, March 23, 2015 11:02 PM
> To: java-user@lucene.apache.org; Ahmet Arslan
> Subject: Re: CachingTokenFilter tests fail when using MockTokenizer
> 
> Hello Ahmet,
> Unfortunately the test still fails with the same error: "end() called before
> incrementToken() returned false!". I am not sure if I am misusing
> CachingTokenFilter, or if it cannot be used with MockTokenizer, since it
> "always calls end() before incrementToken() returns false".
> Spyros
> 
> 
> 
> 
>  On Monday, March 23, 2015 9:12 PM, Ahmet Arslan
>  wrote:
> 
> 
>  Hi Spyros,
> 
> Not 100% sure but I think you should override reset method.
> 
> @Override
> public void reset() throws IOException {
>     super.reset();
>
>     cachedInput = null;
> }
> 
> Ahmet
> 
> 
> On Monday, March 23, 2015 1:29 PM, Spyros Kapnissis
>  wrote:
> Hello,
> We have a couple of custom token filters that use CachingTokenFilter
> internally. However, when we try to test them with MockTokenizer so that
> we can have these nice TokenStream API checks that it provides, the tests
> fail with: "java.lang.AssertionError: end() called before incrementToken()
> returned false!"
> 
> Here is a link with a unit test to reproduce the issue:
> https://gist.github.com/spyk/c783c72689410070811b
> Are we misusing CachingTokenFilter? Or is this an issue with MockTokenizer
> when used with CachingTokenFilter?
> Thanks!
> Spyros
> 





Re: CachingTokenFilter tests fail when using MockTokenizer

2015-03-23 Thread Spyros Kapnissis
Hello Ahmet, 
Unfortunately the test still fails with the same error: "end() called before 
incrementToken() returned false!". I am not sure if I am misusing 
CachingTokenFilter, or if it cannot be used with MockTokenizer, since it 
"always calls end() before incrementToken() returns false".
Spyros




 On Monday, March 23, 2015 9:12 PM, Ahmet Arslan 
 wrote:
   

 Hi Spyros,

Not 100% sure but I think you should override reset method.

@Override
public void reset() throws IOException {
    super.reset();

    cachedInput = null;
}

Ahmet


On Monday, March 23, 2015 1:29 PM, Spyros Kapnissis  
wrote:
Hello, 
We have a couple of custom token filters that use CachingTokenFilter 
internally. However, when we try to test them with MockTokenizer so that we can 
have these nice TokenStream API checks that it provides, the tests fail with: 
"java.lang.AssertionError: end() called before incrementToken() returned false!"

Here is a link with a unit test to reproduce the issue: 
https://gist.github.com/spyk/c783c72689410070811b
Are we misusing CachingTokenFilter? Or is this an issue with MockTokenizer when
used with CachingTokenFilter?
Thanks!
Spyros




  

Re: How to merge several Taxonomy indexes

2015-03-23 Thread Gimantha Bandara
Hi Christoph,

I think TaxonomyMergeUtils is meant to merge a taxonomy directory and an index
together (correct me if I am wrong). Can it be used to merge several
taxonomy directories together and create one taxonomy index?

On Mon, Mar 23, 2015 at 9:19 PM, Christoph Kaser 
wrote:

> Hi Gimantha,
>
> have a look at the class org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils,
> which does exactly what you need.
>
> Best regards,
> Christoph
>
> On 23.03.2015 at 15:44, Gimantha Bandara wrote:
>
>> Hi all,
>>
>> Can anyone point me to a way to merge several taxonomy indexes? My
>> requirement is as follows. I have several taxonomy indexes and normal
>> document indexes. I want to merge the taxonomy indexes together, merge the
>> document indexes together, and search across the result. The document part
>> I have figured out, and it is easy: all I have to do is create a
>> MultiReader and pass it to IndexSearcher. But I am stuck on merging the
>> taxonomy indexes. Is there a way to merge taxonomy indexes?
>>
>>
>
> --
> Dipl.-Inf. Christoph Kaser
>
> IconParc GmbH
> Sophienstrasse 1
> 80333 München
>
> www.iconparc.de
>
> Tel +49 -89- 15 90 06 - 21
> Fax +49 -89- 15 90 06 - 49
>
> Geschäftsleitung: Dipl.-Ing. Roland Brückner, Dipl.-Inf. Sven Angerer. HRB
> 121830, Amtsgericht München
>
>
>
>
>


-- 
Gimantha Bandara
Software Engineer
WSO2. Inc : http://wso2.com
Mobile : +94714961919


Re: CachingTokenFilter tests fail when using MockTokenizer

2015-03-23 Thread Ahmet Arslan
Hi Spyros,

Not 100% sure but I think you should override reset method.

@Override
public void reset() throws IOException {
    super.reset();

    cachedInput = null;
}

Ahmet


On Monday, March 23, 2015 1:29 PM, Spyros Kapnissis  
wrote:
Hello, 
We have a couple of custom token filters that use CachingTokenFilter 
internally. However, when we try to test them with MockTokenizer so that we can 
have these nice TokenStream API checks that it provides, the tests fail with: 
"java.lang.AssertionError: end() called before incrementToken() returned false!"

Here is a link with a unit test to reproduce the issue: 
https://gist.github.com/spyk/c783c72689410070811b
Are we misusing CachingTokenFilter? Or is this an issue with MockTokenizer when
used with CachingTokenFilter?
Thanks!
Spyros




Re: How to merge several Taxonomy indexes

2015-03-23 Thread Christoph Kaser

Hi Gimantha,

have a look at the class 
org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils, which does exactly 
what you need.
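
Editorial illustration of why merging taxonomies needs more than
concatenation: each taxonomy assigns its own integer ordinals to category
paths, so the same category can carry different ordinals in different sources,
and documents that reference those ordinals have to be remapped. The sketch
below is a toy model in plain Java with no Lucene classes; ToyTaxonomy,
mergeInto and remap are hypothetical names invented here. It shows the general
idea of an ordinal map under those assumptions and is not the implementation
of TaxonomyMergeUtils.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: a "taxonomy" is a list of category paths; the ordinal is the list position. */
class TaxonomyMergeSketch {

    /** Destination taxonomy: assigns each distinct category path exactly one ordinal. */
    static final class ToyTaxonomy {
        private final List<String> paths = new ArrayList<>();
        private final Map<String, Integer> ordinals = new HashMap<>();

        /** Returns the existing ordinal for the path, or assigns the next free one. */
        int addCategory(String path) {
            Integer existing = ordinals.get(path);
            if (existing != null) {
                return existing;
            }
            int ordinal = paths.size();
            paths.add(path);
            ordinals.put(path, ordinal);
            return ordinal;
        }

        int size() {
            return paths.size();
        }

        String pathOf(int ordinal) {
            return paths.get(ordinal);
        }
    }

    /**
     * Adds every category of one source taxonomy to the destination and returns
     * the ordinal map, where map[sourceOrdinal] == destinationOrdinal.
     */
    static int[] mergeInto(ToyTaxonomy dest, List<String> sourcePaths) {
        int[] ordinalMap = new int[sourcePaths.size()];
        for (int srcOrd = 0; srcOrd < sourcePaths.size(); srcOrd++) {
            ordinalMap[srcOrd] = dest.addCategory(sourcePaths.get(srcOrd));
        }
        return ordinalMap;
    }

    /** Rewrites the ordinals stored with one document so they point into the merged taxonomy. */
    static int[] remap(int[] docOrdinals, int[] ordinalMap) {
        int[] remapped = new int[docOrdinals.length];
        for (int i = 0; i < docOrdinals.length; i++) {
            remapped[i] = ordinalMap[docOrdinals[i]];
        }
        return remapped;
    }

    public static void main(String[] args) {
        ToyTaxonomy merged = new ToyTaxonomy();
        int[] mapA = mergeInto(merged, Arrays.asList("Author/Bob", "Year/2015"));
        int[] mapB = mergeInto(merged, Arrays.asList("Year/2015", "Author/Lisa"));

        // A document from source B that referenced ordinals 0 and 1 of taxonomy B.
        int[] docB = remap(new int[] {0, 1}, mapB);
        System.out.println(merged.size() + " categories; doc from B now uses "
                + docB[0] + " and " + docB[1]);
    }
}
```

In this toy run the shared category "Year/2015" keeps a single ordinal in the
merged taxonomy, and the document from source B is rewritten to use the merged
ordinals. Merging several sources is then a matter of repeating mergeInto and
remap once per source.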


Best regards,
Christoph

On 23.03.2015 at 15:44, Gimantha Bandara wrote:

Hi all,

Can anyone point me to a way to merge several taxonomy indexes? My requirement
is as follows. I have several taxonomy indexes and normal document
indexes. I want to merge the taxonomy indexes together, merge the document
indexes together, and search across the result. The document part I have
figured out, and it is easy: all I have to do is create a MultiReader and
pass it to IndexSearcher. But I am stuck on merging the taxonomy indexes.
Is there a way to merge taxonomy indexes?




--
Dipl.-Inf. Christoph Kaser

IconParc GmbH
Sophienstrasse 1
80333 München

www.iconparc.de

Tel +49 -89- 15 90 06 - 21
Fax +49 -89- 15 90 06 - 49

Geschäftsleitung: Dipl.-Ing. Roland Brückner, Dipl.-Inf. Sven Angerer. HRB
121830, Amtsgericht München

 






How to merge several Taxonomy indexes

2015-03-23 Thread Gimantha Bandara
Hi all,

Can anyone point me to a way to merge several taxonomy indexes? My requirement
is as follows. I have several taxonomy indexes and normal document
indexes. I want to merge the taxonomy indexes together, merge the document
indexes together, and search across the result. The document part I have
figured out, and it is easy: all I have to do is create a MultiReader and
pass it to IndexSearcher. But I am stuck on merging the taxonomy indexes.
Is there a way to merge taxonomy indexes?

-- 
Gimantha Bandara
Software Engineer
WSO2. Inc : http://wso2.com
Mobile : +94714961919


CachingTokenFilter tests fail when using MockTokenizer

2015-03-23 Thread Spyros Kapnissis
Hello, 
We have a couple of custom token filters that use CachingTokenFilter 
internally. However, when we try to test them with MockTokenizer so that we can 
have these nice TokenStream API checks that it provides, the tests fail with: 
"java.lang.AssertionError: end() called before incrementToken() returned false!"

Here is a link with a unit test to reproduce the issue: 
https://gist.github.com/spyk/c783c72689410070811b
Are we misusing CachingTokenFilter? Or is this an issue with MockTokenizer when
used with CachingTokenFilter?
Thanks!
Spyros