[jira] Created: (LUCENE-2244) Improve StandardTokenizer's understanding of non ASCII punctuation and quotes

2010-01-30 Thread Andi Vajda (JIRA)
Improve StandardTokenizer's understanding of non ASCII punctuation and quotes
-

 Key: LUCENE-2244
 URL: https://issues.apache.org/jira/browse/LUCENE-2244
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Affects Versions: 3.0
Reporter: Andi Vajda


In the vein of LUCENE-1126 and LUCENE-1390, StandardTokenizerImpl.jflex should 
do a better job at understanding non-ASCII punctuation characters.

For example, its understanding of the single-quote character ' is currently 
limited to that character only. It will set a token's type to APOSTROPHE only 
if the ' was used.
In the patch attached, I added all the characters that ASCIIFoldingFilter would 
change into '.

I'm not sure that this is the right approach, so I didn't write a complete patch 
for all the other hardcoded characters used in the jflex rules, such as . and -, 
which have some variants in ASCIIFoldingFilter that could be used as well.

Maybe a better approach would be to make it possible to have an 
ASCIIFoldingFilter-like reader as a character filter that could be inserted 
in front of StandardTokenizer ?
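
A minimal sketch of the character-filter idea, using only java.io rather than 
any Lucene class. The class name ApostropheFoldingReader and the four code 
points it folds are assumptions chosen for illustration; they are not taken 
from the attached patch or from ASCIIFoldingFilter's actual table.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical reader that folds apostrophe look-alikes to ASCII ' before
// the text reaches a tokenizer.
public class ApostropheFoldingReader extends FilterReader {

    public ApostropheFoldingReader(Reader in) {
        super(in);
    }

    // Illustrative subset only: maps a few apostrophe variants to '.
    public static char fold(char c) {
        switch (c) {
            case '\u2018': // left single quotation mark
            case '\u2019': // right single quotation mark
            case '\u02BC': // modifier letter apostrophe
            case '\uFF07': // fullwidth apostrophe
                return '\'';
            default:
                return c;
        }
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == -1 ? -1 : fold((char) c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // When n is -1 (end of stream) the loop body never runs.
        for (int i = off; i < off + n; i++) {
            buf[i] = fold(buf[i]);
        }
        return n;
    }

    // Convenience helper: runs a whole string through the reader.
    public static String foldAll(String text) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new ApostropheFoldingReader(new StringReader(text))) {
            int c;
            while ((c = r.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(foldAll("O\u2019Reilly"));
    }
}
```

Because every replacement here is one character for one character, token 
offsets stay valid. A fuller ASCIIFoldingFilter-like mapping that expands one 
character into several would also have to correct offsets.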

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2244) Improve StandardTokenizer's understanding of non ASCII punctuation and quotes

2010-01-30 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-2244:
---

Attachment: StandardTokenizerImpl.jflex.diff

A patch expanding the understanding of the single-quote character to the 
characters that ASCIIFoldingFilter turns into '.




[jira] Resolved: (LUCENE-1580) ISOLatin1AccentFilter does not handle Turkish (UTF-8) chars correctly.

2009-03-28 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda resolved LUCENE-1580.


Resolution: Duplicate

See https://issues.apache.org/jira/browse/LUCENE-1390

 ISOLatin1AccentFilter does not handle Turkish (UTF-8) chars correctly.
 --

 Key: LUCENE-1580
 URL: https://issues.apache.org/jira/browse/LUCENE-1580
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Digy
Priority: Minor
 Attachments: ISOLatin1AccentFilter.patch


 The mappings below are missing:
 Ğ -- G
 ğ -- g
 İ -- I
 ı -- i
 Ş -- S
 ş -- s
 DIGY
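
 The mappings listed above can be written down as a small lookup table. This 
 is only a sketch: the class name TurkishFoldingSketch is hypothetical, and the 
 code is not the attached ISOLatin1AccentFilter.patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone table for the six mappings in the report.
public class TurkishFoldingSketch {

    private static final Map<Character, Character> FOLD = new HashMap<>();
    static {
        FOLD.put('\u011E', 'G'); // capital G with breve
        FOLD.put('\u011F', 'g'); // small g with breve
        FOLD.put('\u0130', 'I'); // capital I with dot above
        FOLD.put('\u0131', 'i'); // small dotless i
        FOLD.put('\u015E', 'S'); // capital S with cedilla
        FOLD.put('\u015F', 's'); // small s with cedilla
    }

    // Replaces each mapped character; everything else passes through.
    public static String fold(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Character mapped = FOLD.get(c);
            out.append(mapped != null ? mapped.charValue() : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("I\u011Fd\u0131r"));
    }
}
```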




[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-06 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12654160#action_12654160
 ] 

Andi Vajda commented on LUCENE-1390:


Thanks Mark !


 add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter
 

 Key: LUCENE-1390
 URL: https://issues.apache.org/jira/browse/LUCENE-1390
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis
 Environment: any
Reporter: Andi Vajda
Assignee: Mark Miller
Priority: Minor
 Fix For: 2.9

 Attachments: ASCIIFoldingFilter.patch, ASCIIFoldingFilter.patch, 
 ASCIIFoldingFilter.patch


 The ISOLatin1AccentFilter is removing accents from accented characters in the 
 ISO Latin 1 character set.
 It does what it does and there is no bug with it.
 It would be nicer, though, if there was a more comprehensive version of this 
 code that included not just ISO-Latin-1 (ISO-8859-1) but the entire Latin 1 
 and Latin Extended A unicode blocks.
 See: http://en.wikipedia.org/wiki/Latin-1_Supplement_unicode_block
 See: http://en.wikipedia.org/wiki/Latin_Extended-A_unicode_block
 That way, all languages using roman characters are covered.
 A new class, ISOLatinAccentFilter, is attached. It is intended to supersede 
 ISOLatin1AccentFilter, which should get deprecated.




[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12652875#action_12652875
 ] 

Andi Vajda commented on LUCENE-1390:



Ah, I see now what you're asking for. Sorry about the misunderstanding.
I believe I had picked 'e' for schwa because it looks closest to that 
letter. I have no objections to switching to using 'a' instead if that's 
more correct.
This Wikipedia article seems to agree: http://en.wikipedia.org/wiki/Schwa_(Cyrillic)
This other Wikipedia article, http://en.wikipedia.org/wiki/Schwa, is less clear 
about this, but it seems that using 'a' instead of 'e' doesn't contradict it.

Steven, I can amend the patch but you said you had more changes coming. If 
that's the case, could you please add this change as well ? If that's not the 
case, is it ok for me to add this change and call for this bug to be 
committed to trunk and closed ?





[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12652911#action_12652911
 ] 

Andi Vajda commented on LUCENE-1390:


Great, I'll include Robert's change and try to convince a committer to 
finalize it.





[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12653045#action_12653045
 ] 

Andi Vajda commented on LUCENE-1390:




This class includes all of ISOLatin1AccentFilter.

Still, a difference in behaviour could be seen when using the new 
filter with characters getting converted now that didn't before.

If that sort of lack of backwards compatibility is something we don't want 
to impose on the 3.0 release then the ISOLatin1AccentFilter class needs to 
be preserved.

Thanks for volunteering to finalize this bug !





[jira] Updated: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-1390:
---

Attachment: ASCIIFoldingFilter.patch

This latest version supersedes the previous one and maps all schwa characters 
to 'A' or 'a' depending on their case. U+0259, lowercase schwa, was missing 
and has been added.




[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12653123#action_12653123
 ] 

Andi Vajda commented on LUCENE-1390:


Mark, I attached a new version of the patch with Robert's change.

As for the deprecation of ISOLatin1AccentFilter.java, I don't have a definite 
opinion on this.
It's pretty much redundant with what this new class does. If the maintenance 
overhead is not too bad then keeping the duplication around may be worth the 
effort to preserve some backwards compat.

Thanks for taking this from here !
Andi..




[jira] Updated: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-1390:
---

Attachment: (was: ISOLatinAccentFilter.java)




[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-03 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12653139#action_12653139
 ] 

Andi Vajda commented on LUCENE-1390:





Yep, I'm leaning that way too.





[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-12-02 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12652694#action_12652694
 ] 

Andi Vajda commented on LUCENE-1390:


Could you please attach a patch for the change you requested ? I'm not sure 
it's displaying correctly here. You seem to be asking about a change to the 
mapping of AE and E+acute, which is unexpected. Thanks !






[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-10-28 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12643152#action_12643152
 ] 

Andi Vajda commented on LUCENE-1390:


Wow, Steve, I'm impressed. This is quite an improvement over my earlier patches 
and even more of an improvement over ISOLatin1AccentFilter. Thank you for doing 
this !
What's next ? Does any Lucene committer watching this bug have objections to 
checking this in ?
One (minor) missing piece in the patch is the deprecation of 
ISOLatin1AccentFilter itself.




[jira] Updated: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-09-18 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-1390:
---

Attachment: ISOLatinAccentFilter.java

ISOLatinAccentFilter.java again, now with Unicode Latin Extended B as well.




[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-09-18 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12632458#action_12632458
 ] 

Andi Vajda commented on LUCENE-1390:




I think that would be a whole lot of typing :)
Not a bad idea, still.
I'm in the process of entering the 1E00 - 1EFF range.
The Extended-C and D blocks also have relevant things to include but I'm 
hoping to stop at the Extended Additional block currently in progress.





[jira] Created: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-09-17 Thread Andi Vajda (JIRA)
add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter


 Key: LUCENE-1390
 URL: https://issues.apache.org/jira/browse/LUCENE-1390
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis
 Environment: any
Reporter: Andi Vajda


The ISOLatin1AccentFilter is removing accents from accented characters in the 
ISO Latin 1 character set.
It does what it does and there is no bug with it.

It would be nicer, though, if there was a more comprehensive version of this 
code that included not just ISO-Latin-1 (ISO-8859-1) but the entire Latin 1 and 
Latin Extended A unicode blocks.
See: http://en.wikipedia.org/wiki/Latin-1_Supplement_unicode_block
See: http://en.wikipedia.org/wiki/Latin_Extended-A_unicode_block

That way, all languages using roman characters are covered.
A new class, ISOLatinAccentFilter, is attached. It is intended to supersede 
ISOLatin1AccentFilter, which should get deprecated.
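
To get a feel for the kind of folding requested here without the attached 
class, the sketch below swaps in a different technique: Unicode canonical 
decomposition with java.text.Normalizer, followed by removal of combining 
marks. This is not the method used by the attached ISOLatinAccentFilter, and 
the class name AccentStrippingSketch is hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper: strips accents by decomposing and dropping marks.
public class AccentStrippingSketch {

    public static String stripAccents(String s) {
        // NFD splits an accented letter into base letter + combining mark(s).
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches the combining marks left behind by decomposition.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // S with caron (Latin Extended-A) and e with acute (Latin-1).
        System.out.println(stripAccents("\u0160koda caf\u00E9"));
    }
}
```

Letters that have no canonical decomposition, such as o with stroke (U+00F8) 
or l with stroke (U+0142), pass through this sketch unchanged, which is one 
argument for an explicit per-character table.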




[jira] Updated: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-09-17 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-1390:
---

Attachment: ISOLatinAccentFilter.java

The new ISOLatinAccentFilter class, superseding ISOLatin1AccentFilter.




[jira] Commented: (LUCENE-1390) add ISOLatinAccentFilter and deprecate ISOLatin1AccentFilter

2008-09-17 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12631946#action_12631946
 ] 

Andi Vajda commented on LUCENE-1390:



Makes sense.


I did look at that block and it looked much more remote from the purpose of 
this class. But you're right, many of these could be handled as well.

And I agree that they should be handled to be able to claim to be doing a 
complete job.

So far, I've claimed that this class handles Latin 1 and Latin Extended A, 
which should cover most, if not all, European/Turkish languages using Latin 
script, and thus goes much further than the ISOLatin1AccentFilter in that 
respect.





[jira] Commented: (LUCENE-1339) Add IndexReader.acquire() and release() methods using IndexReader's ref counting

2008-07-19 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12615001#action_12615001
 ] 

Andi Vajda commented on LUCENE-1339:


That would work just as well !
Andi..


 Add IndexReader.acquire() and release() methods using IndexReader's ref 
 counting
 

 Key: LUCENE-1339
 URL: https://issues.apache.org/jira/browse/LUCENE-1339
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Andi Vajda
 Fix For: 2.3.2

 Attachments: lucene-1339.patch


 From: 
 http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200807.mbox/[EMAIL 
 PROTECTED]
 I have a server where a bunch of threads are handling search requests. I
 have another process that updates the index used by the search server and
 that asks the search server to reopen its index reader after the updates
 have completed.
 When I reopen() the index reader, I also close the old one (if the reopen()
 yielded a new instance). This causes problems for the other threads that
 are currently in the middle of a search request.
 I'd like to propose the addition of two methods, acquire() and release() 
 (attached to this bug report), that increment/decrement the ref count that 
 IndexReader 
 instances currently maintain for related purposes. That ref count prevents 
 the index reader from being actually closed until it reaches zero.
 My server's search threads, by acquiring and releasing the index reader, 
 can be sure that the index reader they're currently using is good until 
 they're done with the current request, i.e., until they release() it.




[jira] Created: (LUCENE-1339) Add IndexReader.acquire() and release() methods using IndexReader's ref counting

2008-07-18 Thread Andi Vajda (JIRA)
Add IndexReader.acquire() and release() methods using IndexReader's ref counting


 Key: LUCENE-1339
 URL: https://issues.apache.org/jira/browse/LUCENE-1339
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Andi Vajda
 Fix For: 2.3.2


From: 
http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200807.mbox/[EMAIL 
PROTECTED]

I have a server where a bunch of threads are handling search requests. I
have another process that updates the index used by the search server and
that asks the search server to reopen its index reader after the updates
have completed.

When I reopen() the index reader, I also close the old one (if the reopen()
yielded a new instance). This causes problems for the other threads that
are currently in the middle of a search request.

I'd like to propose the addition of two methods, acquire() and release() 
(attached to this bug report), that increment/decrement the ref count that 
IndexReader 
instances currently maintain for related purposes. That ref count prevents 
the index reader from being actually closed until it reaches zero.

My server's search threads, by acquiring and releasing the index reader, 
can be sure that the index reader they're currently using is good until 
they're done with the current request, i.e., until they release() it.
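The acquire()/release() idea can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch and not Lucene's IndexReader: the CountedReader class, its fields, and the choice to start the count at 1 (the opener's own reference) are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {
    /** Hypothetical stand-in for a reader; not Lucene's IndexReader. */
    static class CountedReader {
        // Starts at 1: the reference held by whoever opened the reader.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed = false;

        /** Takes a reference; returns false if the reader is already closed. */
        boolean acquire() {
            while (true) {
                int n = refCount.get();
                if (n <= 0) {
                    return false;
                }
                if (refCount.compareAndSet(n, n + 1)) {
                    return true;
                }
            }
        }

        /** Drops a reference; the last release actually closes the reader. */
        void release() {
            int n = refCount.decrementAndGet();
            if (n == 0) {
                closed = true; // a real reader would free files and buffers here
            } else if (n < 0) {
                throw new IllegalStateException("release() without matching acquire()");
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    public static void main(String[] args) {
        CountedReader reader = new CountedReader();
        boolean ok = reader.acquire();   // a search thread starts a request
        reader.release();                // the updater drops the opener's reference
        System.out.println(ok + " " + reader.isClosed());  // prints true false
        reader.release();                // the search thread finishes its request
        System.out.println(reader.isClosed());              // prints true
    }
}
```

In this sketch the updater's close is just another release(), so a reader that is still in use by a search thread stays open until that thread releases it.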





[jira] Updated: (LUCENE-1339) Add IndexReader.acquire() and release() methods using IndexReader's ref counting

2008-07-18 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-1339:
---

Attachment: lucene-1339.patch

 Add IndexReader.acquire() and release() methods using IndexReader's ref 
 counting
 

 Key: LUCENE-1339
 URL: https://issues.apache.org/jira/browse/LUCENE-1339
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Andi Vajda
 Fix For: 2.3.2

 Attachments: lucene-1339.patch


 From: 
 http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200807.mbox/[EMAIL 
 PROTECTED]
 I have a server where a bunch of threads are handling search requests. I
 have another process that updates the index used by the search server and
 that asks the search server to reopen its index reader after the updates
 have completed.
 When I reopen() the index reader, I also close the old one (if the reopen()
 yielded a new instance). This causes problems for the other threads that
 are currently in the middle of a search request.
 I'd like to propose the addition of two methods, acquire() and release() 
 (attached to this bug report), that increment/decrement the ref count that 
 IndexReader 
 instances currently maintain for related purposes. That ref count prevents 
 the index reader from being actually closed until it reaches zero.
 My server's search threads, by acquiring and releasing the index reader, 
 can be sure that the index reader they're currently using is good until 
 they're done with the current request, i.e., until they release() it.




[jira] Created: (LUCENE-1234) BoostingTermQuery's BoostingSpanScorer class should be protected instead of package access

2008-03-14 Thread Andi Vajda (JIRA)
BoostingTermQuery's BoostingSpanScorer class should be protected instead of 
package access
--

 Key: LUCENE-1234
 URL: https://issues.apache.org/jira/browse/LUCENE-1234
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.3.1
Reporter: Andi Vajda
Priority: Trivial


Currently, BoostingTermScorer, an inner class of BoostingTermQuery, is not 
accessible from outside the search.payloads package, 
making it difficult to write an extension of BoostingTermQuery. The other inner 
classes are protected already, as they should be.




[jira] Updated: (LUCENE-1234) BoostingTermQuery's BoostingSpanScorer class should be protected instead of package access

2008-03-14 Thread Andi Vajda (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andi Vajda updated LUCENE-1234:
---

Attachment: patches-lucene-2.3.1

patch against lucene-2.3.1 sources

 BoostingTermQuery's BoostingSpanScorer class should be protected instead of 
 package access
 --

 Key: LUCENE-1234
 URL: https://issues.apache.org/jira/browse/LUCENE-1234
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.3.1
Reporter: Andi Vajda
Priority: Trivial
 Attachments: patches-lucene-2.3.1


 Currently, BoostingTermScorer, an inner class of BoostingTermQuery, is not 
 accessible from outside the search.payloads package, 
 making it difficult to write an extension of BoostingTermQuery. The other 
 inner classes are protected already, as they should be.




[jira] Commented: (LUCENE-1234) BoostingTermQuery's BoostingSpanScorer class should be protected instead of package access

2008-03-14 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578976#action_12578976
 ] 

Andi Vajda commented on LUCENE-1234:


The inaccessible class is called BoostingSpanScorer.
The method I'd like to override there is the score() method.
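A rough sketch of the kind of extension this is about, with hypothetical class names rather than the actual BoostingTermQuery code: when the nested scorer class is protected, a subclass of the enclosing query class (including one in another package) can extend it and override score().

```java
public class ProtectedScorerSketch {
    /** Hypothetical stand-in for a query class with a nested scorer. */
    static class BaseQuery {
        // protected rather than package access, so subclasses of BaseQuery
        // in any package can extend it
        protected static class BaseScorer {
            public float score() {
                return 1.0f;
            }
        }

        protected BaseScorer createScorer() {
            return new BaseScorer();
        }

        public float scoreOnce() {
            return createScorer().score();
        }
    }

    /** An extension that plugs in its own scorer with a different score(). */
    static class BoostedQuery extends BaseQuery {
        protected static class BoostedScorer extends BaseScorer {
            @Override
            public float score() {
                return 2.0f * super.score();
            }
        }

        @Override
        protected BaseScorer createScorer() {
            return new BoostedScorer();
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseQuery().scoreOnce());     // prints 1.0
        System.out.println(new BoostedQuery().scoreOnce());  // prints 2.0
    }
}
```

Everything here lives in one file for brevity, so package access would also compile; the difference only shows once BoostedQuery moves to a different package.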

 BoostingTermQuery's BoostingSpanScorer class should be protected instead of 
 package access
 --

 Key: LUCENE-1234
 URL: https://issues.apache.org/jira/browse/LUCENE-1234
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.3.1
Reporter: Andi Vajda
Priority: Trivial
 Attachments: patches-lucene-2.3.1


 Currently, BoostingTermScorer, an inner class of BoostingTermQuery, is not 
 accessible from outside the search.payloads package, 
 making it difficult to write an extension of BoostingTermQuery. The other 
 inner classes are protected already, as they should be.




[jira] Commented: (LUCENE-1182) SimilarityDelegator is missing a delegating scorePayload() method

2008-02-20 Thread Andi Vajda (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570903#action_12570903
 ] 

Andi Vajda commented on LUCENE-1182:


Err, I meant to say the handy SimilarityDelegator class


 SimilarityDelegator is missing a delegating scorePayload() method
 -

 Key: LUCENE-1182
 URL: https://issues.apache.org/jira/browse/LUCENE-1182
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.3
Reporter: Andi Vajda
Priority: Minor

 The handy SimilarityDelegator method is missing a scorePayload() delegating 
 method.
 The fix is trivial: add the code below at the end of the class:
   public float scorePayload(String fieldName, byte[] payload, int offset, int length)
   {
     // forward the call to the wrapped Similarity
     return delegee.scorePayload(fieldName, payload, offset, length);
   }




[jira] Created: (LUCENE-1182) SimilarityDelegator is missing a delegating scorePayload() method

2008-02-20 Thread Andi Vajda (JIRA)
SimilarityDelegator is missing a delegating scorePayload() method
-

 Key: LUCENE-1182
 URL: https://issues.apache.org/jira/browse/LUCENE-1182
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.3
Reporter: Andi Vajda
Priority: Minor


The handy SimilarityDelegator method is missing a scorePayload() delegating 
method.
The fix is trivial: add the code below at the end of the class:

  public float scorePayload(String fieldName, byte[] payload, int offset, int length)
  {
    // forward the call to the wrapped Similarity
    return delegee.scorePayload(fieldName, payload, offset, length);
  }
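Why a missing delegating method matters can be shown with a small self-contained sketch. The classes below are hypothetical stand-ins, not Lucene's Similarity or SimilarityDelegator; only the scorePayload() signature and the delegee field name are taken from the report above.

```java
public class DelegatorSketch {
    /** Hypothetical base class with a default payload score. */
    static class BaseSimilarity {
        public float scorePayload(String fieldName, byte[] payload, int offset, int length) {
            return 1.0f; // default: the payload is ignored
        }
    }

    /** A custom implementation that uses the first payload byte as the score. */
    static class PayloadSimilarity extends BaseSimilarity {
        @Override
        public float scorePayload(String fieldName, byte[] payload, int offset, int length) {
            return payload[offset];
        }
    }

    /** A delegator that forgets scorePayload(): calls fall back to the base default. */
    static class IncompleteDelegator extends BaseSimilarity {
        protected final BaseSimilarity delegee;

        IncompleteDelegator(BaseSimilarity delegee) {
            this.delegee = delegee;
        }
    }

    /** The fix: forward scorePayload() to the wrapped instance. */
    static class CompleteDelegator extends IncompleteDelegator {
        CompleteDelegator(BaseSimilarity delegee) {
            super(delegee);
        }

        @Override
        public float scorePayload(String fieldName, byte[] payload, int offset, int length) {
            return delegee.scorePayload(fieldName, payload, offset, length);
        }
    }

    public static void main(String[] args) {
        byte[] payload = {5};
        BaseSimilarity custom = new PayloadSimilarity();
        // The incomplete delegator silently ignores the custom scoring.
        System.out.println(new IncompleteDelegator(custom).scorePayload("f", payload, 0, 1)); // prints 1.0
        System.out.println(new CompleteDelegator(custom).scorePayload("f", payload, 0, 1));   // prints 5.0
    }
}
```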





[jira] Commented: (LUCENE-722) DEFAULT spelled DEFALT in MoreLikeThis.java

2006-11-21 Thread Andi Vajda (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-722?page=comments#action_12451809 ] 

Andi Vajda commented on LUCENE-722:
---

Yes, you fixed it in one place but this file is actually duplicated in the 
Lucene source tree.
The bug I filed was about the other occurrence, in the 'queries' contrib module 
since it seems to be the one that is current as implied in the 'queries' module 
readme.txt file.

 DEFAULT spelled DEFALT in MoreLikeThis.java
 ---

 Key: LUCENE-722
 URL: http://issues.apache.org/jira/browse/LUCENE-722
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.0.0
 Environment: all
Reporter: Andi Vajda
Priority: Minor
 Fix For: 2.1


 DEFAULT is spelled DEFALT in 
 contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[jira] Reopened: (LUCENE-722) DEFAULT spelled DEFALT in MoreLikeThis.java

2006-11-21 Thread Andi Vajda (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-722?page=all ]

Andi Vajda reopened LUCENE-722:
---

 
contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java is 
still wrong.

 DEFAULT spelled DEFALT in MoreLikeThis.java
 ---

 Key: LUCENE-722
 URL: http://issues.apache.org/jira/browse/LUCENE-722
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.0.0
 Environment: all
Reporter: Andi Vajda
Priority: Minor
 Fix For: 2.1


 DEFAULT is spelled DEFALT in 
 contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java




[jira] Created: (LUCENE-722) DEFAULT spelled DEFALT in MoreLikeThis.java

2006-11-21 Thread Andi Vajda (JIRA)
DEFAULT spelled DEFALT in MoreLikeThis.java
---

 Key: LUCENE-722
 URL: http://issues.apache.org/jira/browse/LUCENE-722
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.0.0
 Environment: all
Reporter: Andi Vajda
Priority: Minor


DEFAULT is spelled DEFALT in 
contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java




[jira] Commented: (LUCENE-722) DEFAULT spelled DEFALT in MoreLikeThis.java

2006-11-21 Thread Andi Vajda (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-722?page=comments#action_12451697 ] 

Andi Vajda commented on LUCENE-722:
---

http://svn.osafoundation.org/pylucene/trunk/patches.lucene contains a patch 
(among others) to fix this.

 DEFAULT spelled DEFALT in MoreLikeThis.java
 ---

 Key: LUCENE-722
 URL: http://issues.apache.org/jira/browse/LUCENE-722
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Affects Versions: 2.0.0
 Environment: all
Reporter: Andi Vajda
Priority: Minor

 DEFAULT is spelled DEFALT in 
 contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java




[jira] Updated: (LUCENE-676) Promote solr's PrefixFilter into Java Lucene's core

2006-09-26 Thread Andi Vajda (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-676?page=all ]

Andi Vajda updated LUCENE-676:
--

Attachment: TestPrefixFilter.java

Here is another attachment by Yura providing the requested unit test.

 Promote solr's PrefixFilter into Java Lucene's core
 ---

 Key: LUCENE-676
 URL: http://issues.apache.org/jira/browse/LUCENE-676
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Affects Versions: 2.0.1
Reporter: Andi Vajda
Priority: Trivial
 Attachments: PrefixFilter.java, TestPrefixFilter.java


 Solr's PrefixFilter class is not specific to Solr and seems to be of interest 
 to core lucene users (PyLucene in this case).
 Promoting it into the Lucene core would be helpful.
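For readers unfamiliar with what a prefix filter does, the core idea can be sketched without Lucene: in a sorted term list, all terms sharing a prefix are contiguous, so one can seek to the prefix and scan until the first non-matching term. This is an assumption-laden illustration with hypothetical names, not Solr's PrefixFilter or the attached PrefixFilter.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class PrefixScanSketch {
    /** Collects every term that starts with the given prefix. */
    public static List<String> termsWithPrefix(NavigableSet<String> sortedTerms, String prefix) {
        List<String> matches = new ArrayList<>();
        // Seek to the first term >= prefix, then scan forward.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms =
            new TreeSet<>(Arrays.asList("apple", "lucene", "lucid", "luck", "solr"));
        System.out.println(termsWithPrefix(terms, "luc")); // prints [lucene, lucid, luck]
    }
}
```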




[jira] Updated: (LUCENE-676) Promote solr's PrefixFilter into Java Lucene's core

2006-09-25 Thread Andi Vajda (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-676?page=all ]

Andi Vajda updated LUCENE-676:
--

Attachment: PrefixFilter.java

Attached is a version of PrefixFilter that could be added to the Lucene core as 
submitted by Yura Smolsky, a PyLucene user.

 Promote solr's PrefixFilter into Java Lucene's core
 ---

 Key: LUCENE-676
 URL: http://issues.apache.org/jira/browse/LUCENE-676
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Affects Versions: 2.0.1
Reporter: Andi Vajda
Priority: Trivial
 Attachments: PrefixFilter.java


 Solr's PrefixFilter class is not specific to Solr and seems to be of interest 
 to core lucene users (PyLucene in this case).
 Promoting it into the Lucene core would be helpful.




[jira] Commented: (LUCENE-507) CLONE -[PATCH] remove unused variables

2006-04-27 Thread Andi Vajda (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-507?page=comments#action_12376874 ] 

Andi Vajda commented on LUCENE-507:
---

My apologies, I didn't notice this until it was mentioned today.

The //required by gcj comment is not something I added or need.
The few patches for gcj support that were added at my request are listed as 
such in the Lucene sources. The main one has to do with gcj's bug 15411 in 
Searcher.java, the other with naming a method 'delete'.
In general, it is easier to use javac or jikes to compile the .java sources to 
.class files and then use gcj on the resulting .class (or .jar) files to 
produce native binaries. That way, one works around a number of bugs in the gcj 
Java compiler front-end.
Still, there are some patches I need to apply to Lucene in order for it to run 
when compiled with gcj. Some are in QueryParser.java and the first of those 
could be applied to the actual .jj file instead, see here: 
http://svn.osafoundation.org/pylucene/trunk/patches.lucene

The next patches in the file above are there because of limitations in gcjh 
(the Java to C++ header file generator) or because exception catching doesn't 
seem to work well with gcj on Windows. Throwing and catching exceptions in Java 
is not an efficient coding practice when there isn't an actual error, so maybe 
the code in FieldInfos.java could be changed (see the patch file above)?
As for the last patch, the Java runtime that comes with gcj 3.x doesn't 
implement regex, so PyLucene calls into Python's regex support instead.
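The point about exceptions can be illustrated with a small sketch. It uses hypothetical names and is not the FieldInfos.java code or the patch referred to above: a lookup that reports a missing name with a sentinel value lets callers handle the common "not found" case without throwing and catching.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldLookupSketch {
    private final Map<String, Integer> numbersByName = new HashMap<>();

    public void add(String name) {
        // Assign the next free number the first time a name is seen.
        numbersByName.putIfAbsent(name, numbersByName.size());
    }

    /** Exception-based lookup: a missing field is reported by throwing. */
    public int fieldNumberOrThrow(String name) {
        Integer number = numbersByName.get(name);
        if (number == null) {
            throw new IllegalArgumentException("no such field: " + name);
        }
        return number;
    }

    /** Sentinel-based lookup: a missing field is an ordinary result (-1). */
    public int fieldNumber(String name) {
        Integer number = numbersByName.get(name);
        return number == null ? -1 : number;
    }

    public static void main(String[] args) {
        FieldLookupSketch fields = new FieldLookupSketch();
        fields.add("title");
        fields.add("body");
        System.out.println(fields.fieldNumber("body"));    // prints 1
        System.out.println(fields.fieldNumber("missing")); // prints -1
    }
}
```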

 CLONE -[PATCH] remove unused variables
 --

  Key: LUCENE-507
  URL: http://issues.apache.org/jira/browse/LUCENE-507
  Project: Lucene - Java
 Type: Improvement

   Components: Search
 Versions: unspecified
  Environment: Operating System: other
 Platform: Other
 Reporter: Steven Tamm
 Assignee: Lucene Developers
 Priority: Minor
  Attachments: Unused.patch

 Seems I'm the only person who has the unused variable warning turned on in 
 Eclipse :-) This patch removes those unused variables and imports (for now 
 only in the search package). This doesn't introduce changes in 
 functionality, but it should be reviewed anyway: there might be cases where 
 the variables *should* be used, but they are not because of a bug.




[jira] Commented: (LUCENE-555) Index Corruption

2006-04-25 Thread Andi Vajda (JIRA)
[ 
http://issues.apache.org/jira/browse/LUCENE-555?page=comments#action_12376319 ] 

Andi Vajda commented on LUCENE-555:
---

There is an implementation of the Lucene index store that is backed by 
Berkeley DB. Take a look at the 'db' contrib area: 
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/db/
Using this, you can bracket index changes with transactions. Should the cord be 
pulled, you can use Berkeley DB's recovery mechanisms.

 Index Corruption
 

  Key: LUCENE-555
  URL: http://issues.apache.org/jira/browse/LUCENE-555
  Project: Lucene - Java
 Type: Bug

   Components: Index
 Versions: 1.9
  Environment: Linux FC4, Java 1.4.9
 Reporter: dan
 Priority: Critical


 Index Corruption
  output
 java.io.FileNotFoundException: ../_aki.fnm (No such file or directory)
     at java.io.RandomAccessFile.open(Native Method)
     at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
     at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:425)
     at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:434)
     at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:324)
     at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:56)
     at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:144)
     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:110)
     at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:674)
     at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:658)
     at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:517)
  input
 - I open an index, I read, I write, I optimize, and eventually the above 
 happens. The index is unusable.
 - This has happened to me somewhere between 20 and 30 times now - on indexes 
 of different shapes and sizes.
 - I don't know the reason. But, the following requirement applies regardless.
  requirement
 - Like all modern database programs, there has to be a way to repair an 
 index. Period.




[jira] Resolved: (LUCENE-536) JEDirectory delete issue

2006-04-14 Thread Andi Vajda (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-536?page=all ]
 
Andi Vajda resolved LUCENE-536:
---

Resolution: Fixed

Your changes were integrated and committed (rev 394214).
Please, please, please, in the future when sending fixes in, send a proper 
patch as generated by svn diff. Thanks.

 JEDirectory delete issue
 

  Key: LUCENE-536
  URL: http://issues.apache.org/jira/browse/LUCENE-536
  Project: Lucene - Java
 Type: Bug

   Components: Store
 Reporter: Aaron Donovan
 Priority: Minor
  Attachments: File.java, File.java, JEStoreTest.java, JEStoreTest.java

 JEDirectory is not deleting files properly.  Blocks are left behind due to an 
 error in cursor operations.




[jira] Resolved: (LUCENE-482) JE Directory Implementation

2006-01-05 Thread Andi Vajda (JIRA)
 [ http://issues.apache.org/jira/browse/LUCENE-482?page=all ]
 
Andi Vajda resolved LUCENE-482:
---

Fix Version: 1.9
 Resolution: Fixed
  Assign To: Andi Vajda

Fixed in rev 366041. The 'db' contrib area structure was rearranged to 
accommodate multiple implementations, and the new Berkeley DB JE contribution 
by Aaron Donovan was added.

 JE Directory Implementation
 ---

  Key: LUCENE-482
  URL: http://issues.apache.org/jira/browse/LUCENE-482
  Project: Lucene - Java
 Type: New Feature
   Components: Store
 Versions: 1.9
 Reporter: Aaron Donovan
 Assignee: Andi Vajda
 Priority: Minor
  Fix For: 1.9
  Attachments: contrib.zip

 I've created a port of DbDirectory to JE
