[Lucene.Net] [jira] [Updated] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)

2011-09-15 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-444:
---

Fix Version/s: (was: Lucene.Net 3.x)
   Lucene.Net 2.9.4g

 Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
 

 Key: LUCENENET-444
 URL: https://issues.apache.org/jira/browse/LUCENENET-444
 Project: Lucene.Net
  Issue Type: New Feature
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Some missing stemmers + a modified portuguese stemmer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[Lucene.Net] [jira] [Updated] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)

2011-09-15 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-444:
---

Attachment: PortugueseStemmer.cs
TurkishStemmer.cs
RomanianStemmer.cs
HungarianStemmer.cs

 Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
 

 Key: LUCENENET-444
 URL: https://issues.apache.org/jira/browse/LUCENENET-444
 Project: Lucene.Net
  Issue Type: New Feature
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: HungarianStemmer.cs, PortugueseStemmer.cs, 
 RomanianStemmer.cs, TurkishStemmer.cs


 Some missing stemmers + a modified portuguese stemmer.





[Lucene.Net] [jira] [Resolved] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)

2011-09-15 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-444.


Resolution: Fixed

 Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
 

 Key: LUCENENET-444
 URL: https://issues.apache.org/jira/browse/LUCENENET-444
 Project: Lucene.Net
  Issue Type: New Feature
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: HungarianStemmer.cs, PortugueseStemmer.cs, 
 RomanianStemmer.cs, TurkishStemmer.cs


 Some missing stemmers + a modified portuguese stemmer.





[Lucene.Net] [jira] [Commented] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)

2011-09-15 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105418#comment-13105418
 ] 

Digy commented on LUCENENET-444:


Committed to trunk & 2.9.4g branch.

 Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
 

 Key: LUCENENET-444
 URL: https://issues.apache.org/jira/browse/LUCENENET-444
 Project: Lucene.Net
  Issue Type: New Feature
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: HungarianStemmer.cs, PortugueseStemmer.cs, 
 RomanianStemmer.cs, TurkishStemmer.cs


 Some missing stemmers + a modified portuguese stemmer.





[Lucene.Net] [jira] [Commented] (LUCENENET-443) SpellChecker finaliser calls close regardless of if closed already

2011-09-15 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13105478#comment-13105478
 ] 

Digy commented on LUCENENET-443:


+1 for IDisposable in 2.9.4g (since Analyzers, Searchers, Directories, 
IndexReader, and IndexWriter already implement it).

 SpellChecker finaliser calls close regardless of if closed already
 --

 Key: LUCENENET-443
 URL: https://issues.apache.org/jira/browse/LUCENENET-443
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2
Reporter: Stuart Robinson
  Labels: lucene, spellcheck, spellchecker

 The SpellChecker class currently has no publicly visible way of accessing the 
 closed field. It also calls Close in the finaliser, which can throw an 
 exception and so kill the host process when the GC runs. I propose two changes:
 Change the already existing method IsClosed() to public:
     public bool IsClosed()
     {
         return closed;
     }
 and add a check on this in the finaliser:
     ~SpellChecker()
     {
         if (!IsClosed())
             this.Close();
     }
 Ideally this class should implement IDisposable, but I think that would be a 
 bigger job than this two-line change.
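
 For illustration, the IDisposable variant mentioned above could look roughly 
 like the sketch below. ClosableResource and ReleaseCount are hypothetical names 
 standing in for SpellChecker and its real cleanup work; this is not the code 
 that was committed, only a minimal example of an idempotent Close with a 
 finaliser that cannot throw.

```csharp
using System;

// Hypothetical stand-in for SpellChecker; only the close/dispose logic is shown.
public class ClosableResource : IDisposable
{
    private bool closed;

    // Counts how often the underlying resource was actually released (illustration only).
    public int ReleaseCount { get; private set; }

    public bool IsClosed()
    {
        return closed;
    }

    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        Dispose(true);
        // Once disposed explicitly, the finaliser has nothing left to do.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (closed)
        {
            return; // Idempotent: a second Close/Dispose is a no-op.
        }
        closed = true;
        ReleaseCount++;
    }

    ~ClosableResource()
    {
        // An exception escaping a finaliser terminates the process, so swallow it here.
        try
        {
            Dispose(false);
        }
        catch (Exception)
        {
        }
    }
}
```

 With this shape, callers can use a using block, and calling Close twice 
 releases the resource only once.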





[Lucene.Net] [jira] [Resolved] (LUCENENET-414) The definition of CharArraySet is dangerously confusing and leads to bugs when used.

2011-09-06 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-414.


Resolution: Fixed

Fixed.

DIGY

 The definition of CharArraySet is dangerously confusing and leads to bugs 
 when used.
 

 Key: LUCENENET-414
 URL: https://issues.apache.org/jira/browse/LUCENENET-414
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Irrelevant
Reporter: Vincent Van Den Berghe
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Right now, CharArraySet derives from System.Collections.Hashtable, but 
 doesn't actually use this base type for storing elements.
 However, StandardAnalyzer.STOP_WORDS_SET is exposed as a 
 System.Collections.Hashtable. The trivial code to build your own stopword set 
 from StandardAnalyzer.STOP_WORDS_SET plus your own set of stopwords looks 
 like this:
     CharArraySet stopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET, 
         ignoreCase: false);
     foreach (string domainSpecificStopWord in DomainSpecificStopWords)
         stopWords.Add(domainSpecificStopWord);
 ... but it will fail, because the CharArraySet constructor accepts an 
 ICollection and is passed the Hashtable instance of STOP_WORDS_SET: the 
 resulting stopWords will only contain the DomainSpecificStopWords, and not 
 those from STOP_WORDS_SET.
 One workaround would be to replace the first line with this:
     CharArraySet stopWords = new 
         CharArraySet(StandardAnalyzer.STOP_WORDS_SET.Count + 
         DomainSpecificStopWords.Length, ignoreCase: false);
     foreach (string standardStopWord in 
         (CharArraySet)StandardAnalyzer.STOP_WORDS_SET)
         stopWords.Add(standardStopWord);
 ... but this relies on an implementation detail (STOP_WORDS_SET is 
 really an UnmodifiableCharArraySet, which is itself a CharArraySet). It works 
 because the cast forces the foreach to use the correct 
 CharArraySet.GetEnumerator(), which is defined as a new method (this has a 
 bad code smell to it).
 At least 2 possibilities exist to solve this problem:
 - Make CharArraySet use the Hashtable instance and a custom comparator, 
 instead of its own implementation.
 - Make CharArraySet use HashSet<char[]>, available since .NET 3.5.
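
 The pitfall described above can be reproduced in miniature. In the sketch 
 below, HidingSet and HidingDemo are hypothetical names, not Lucene.Net types: 
 the class derives from Hashtable, keeps its elements in a private list, and 
 hides GetEnumerator with "new". Enumerating through the ICollection interface 
 then reaches the (empty) Hashtable, while enumerating through the concrete 
 type reaches the real elements.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical miniature of the problem: derives from Hashtable but stores
// its elements elsewhere, and hides GetEnumerator with "new".
public class HidingSet : Hashtable
{
    private readonly List<string> items = new List<string>();

    public void Add(string item)
    {
        items.Add(item);
    }

    // Hides, rather than overrides, Hashtable.GetEnumerator.
    public new IEnumerator GetEnumerator()
    {
        return items.GetEnumerator();
    }
}

public static class HidingDemo
{
    // Enumerates through the interface, so the base Hashtable enumerator is used.
    public static int CountAsCollection(ICollection collection)
    {
        int n = 0;
        foreach (object o in collection) n++;
        return n;
    }

    // Enumerates through the concrete type, so the hiding method is used.
    public static int CountAsHidingSet(HidingSet set)
    {
        int n = 0;
        foreach (object o in set) n++;
        return n;
    }
}
```

 Any constructor or helper that accepts ICollection sees the first behaviour, 
 which matches the symptom in the report: the copied set silently comes out 
 empty.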





[Lucene.Net] [jira] [Updated] (LUCENENET-442) ParallelMultiSearcher threads don't handle all exceptions

2011-09-05 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-442:
---

Attachment: LUCENENET-442.patch

Thanks Andy.
Nice catch.
I prepared a patch for 2.9.4g and will commit to 2.9.4g branch & trunk soon.

DIGY

 ParallelMultiSearcher threads don't handle all exceptions
 -

 Key: LUCENENET-442
 URL: https://issues.apache.org/jira/browse/LUCENENET-442
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
Reporter: Andy Twidle
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: LUCENENET-442.patch


 The ParallelMultiSearcher doesn't allow non-IOException exceptions to be 
 managed by the calling application.  
 LUCENENET-388 worked around one specific example of this, but any genuine 
 Lucene exception (eg: BooleanQuery.TooManyClauses) will also fall foul of 
 this pattern.
 In our specific instance we could treat the symptoms and up the max clause 
 count, but I'm sure there will be more.  Could the System.IOException be 
 generalised to System.Exception?  Or would that be too much deviation from 
 the Java code base?
 --
 Example stack trace of an exception thrown by a Searcher executed:
 Framework Version: v4.0.30319
 Description: The process was terminated due to an unhandled exception.
 Exception Info: Lucene.Net.Search.BooleanQuery+TooManyClauses
 Stack:
at Lucene.Net.Search.BooleanQuery.Add(Lucene.Net.Search.BooleanClause)
at Lucene.Net.Search.BooleanQuery.Add(Lucene.Net.Search.Query, Occur)
at Lucene.Net.Search.PrefixQuery.Rewrite(Lucene.Net.Index.IndexReader)
at Lucene.Net.Search.BooleanQuery.Rewrite(Lucene.Net.Index.IndexReader)
at Lucene.Net.Search.IndexSearcher.Rewrite(Lucene.Net.Search.Query)
at Lucene.Net.Search.Query.Weight(Lucene.Net.Search.Searcher)
at Lucene.Net.Search.Searcher.CreateWeight(Lucene.Net.Search.Query)
at Lucene.Net.Search.Searcher.Search(Lucene.Net.Search.Query, 
 Lucene.Net.Search.Filter, Lucene.Net.Search.HitCollector)
at Lucene.Net.Search.Searcher.Search(Lucene.Net.Search.Query, 
 Lucene.Net.Search.HitCollector)
at Lucene.Net.Search.QueryWrapperFilter.Bits(Lucene.Net.Index.IndexReader)
at 
 Lucene.Net.Search.CachingWrapperFilter.Bits(Lucene.Net.Index.IndexReader)
at Lucene.Net.Search.IndexSearcher.Search(Lucene.Net.Search.Weight, 
 Lucene.Net.Search.Filter, Lucene.Net.Search.HitCollector)
at Lucene.Net.Search.IndexSearcher.Search(Lucene.Net.Search.Weight, 
 Lucene.Net.Search.Filter, Int32)
at Lucene.Net.Search.MultiSearcherThread.Run()
at 
 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, 
 System.Threading.ContextCallback, System.Object, Boolean)
at 
 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, 
 System.Threading.ContextCallback, System.Object)
at System.Threading.ThreadHelper.ThreadStart()
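
 One general way to let the calling application manage such failures is to 
 catch every exception on the worker thread, remember it, and rethrow it on the 
 caller's thread after the join. The sketch below is a hypothetical helper 
 (CapturingWorker is an invented name) and is not the content of 
 LUCENENET-442.patch.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs work on its own thread, records any exception,
// and surfaces it on the calling thread instead of killing the process.
public class CapturingWorker
{
    private readonly Action work;
    private readonly Thread thread;
    private Exception error;

    public CapturingWorker(Action work)
    {
        this.work = work;
        this.thread = new Thread(Run);
    }

    public void Start()
    {
        thread.Start();
    }

    private void Run()
    {
        try
        {
            work();
        }
        catch (Exception e) // Deliberately broader than IOException.
        {
            error = e;
        }
    }

    // Waits for the worker, then rethrows its failure (if any) to the caller.
    public void JoinAndRethrow()
    {
        thread.Join();
        if (error != null)
        {
            throw new InvalidOperationException("Worker thread failed", error);
        }
    }
}
```

 The original exception is kept as InnerException, so a caller can still 
 distinguish something like a too-many-clauses failure from an I/O error.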





[Lucene.Net] [jira] [Commented] (LUCENENET-358) CloseableThreadLocal memory leak in LocalDataStoreSlot (with workaround)

2011-08-28 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13092541#comment-13092541
 ] 

Digy commented on LUCENENET-358:


New CloseableThreadLocal implementation and its test case committed to trunk.

DIGY

 CloseableThreadLocal memory leak  in LocalDataStoreSlot (with workaround)
 -

 Key: LUCENENET-358
 URL: https://issues.apache.org/jira/browse/LUCENENET-358
 Project: Lucene.Net
  Issue Type: Bug
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
 Environment: Microsoft Windows Server 2008 Enterprise x64. SP2.
 .NET Framework 4.0
Reporter: Rezgar Cadro
Assignee: Digy
Priority: Critical
  Labels: memory, CloseableThreadLocal, LocalDataStoreSlot, leak
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: CloseableThreadLocal MemoryLeak.patch, 
 CloseableThreadLocal.diff, CloseableThreadLocal.diff, 
 CloseableThreadLocal.patch, TestMemLeakage.zip


 Recently we have been suffering from a severe memory leak when executing 
 intense open/close operations on IndexSearcher and IndexModifier. 
 Memory profiling showed that memory is being held by LocalDataStore[] objects.
 After some digging, the root of the problem has been found in 
 CloseableThreadLocal class:
 private System.LocalDataStoreSlot t = 
 System.Threading.Thread.AllocateDataSlot();
 What we see is that every instantiated CloseableThreadLocal object causes 
 a new data slot to be allocated for every thread. 
 Thread.AllocateDataSlot() does not simply allocate a new slot that replaces an 
 old one; it enlarges an existing per-thread buffer, appending data to the 
 end of the internal LocalDataStore[] collection, which causes a severe memory 
 leak.
 Since the t variable is instantiated on every object creation, and (in the 
 current class implementation) every object is used by a single thread, 
 replacing private System.LocalDataStoreSlot t = 
 System.Threading.Thread.AllocateDataSlot(); with a simple private object 
 dataSlot; and removing the hardRefs Dictionary solves the problem and prevents 
 the memory leak. 
 We have tried to implement the expected behavior by using [ThreadStatic] 
 attribute instead of LocalDataStoreSlot, but the attempt failed because of 
 unexpected exceptions being thrown.
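
 As a point of comparison only, .NET 4.0 also ships System.Threading.ThreadLocal<T>, 
 which is disposable. The sketch below wraps it behind a Get/Set/Close surface; 
 DisposableThreadLocal is a hypothetical name, and this is neither the 
 reporter's patch nor the implementation later committed to trunk.

```csharp
using System;
using System.Threading;

// Hypothetical alternative: per-thread storage backed by ThreadLocal<T>
// (available since .NET 4.0) instead of Thread.AllocateDataSlot().
public class DisposableThreadLocal<T> : IDisposable where T : class
{
    private readonly ThreadLocal<T> slot = new ThreadLocal<T>();

    // Returns the calling thread's value, or null if this thread never set one.
    public T Get()
    {
        return slot.Value;
    }

    public void Set(T value)
    {
        slot.Value = value;
    }

    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        slot.Dispose();
    }
}
```

 Each thread sees only its own value, and disposing the wrapper disposes the 
 underlying ThreadLocal<T> rather than leaving a slot allocated per instance.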
 Patch can be found at Lucene.Net repository under 





[Lucene.Net] [jira] [Updated] (LUCENENET-358) CloseableThreadLocal memory leak in LocalDataStoreSlot (with workaround)

2011-08-27 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-358:
---

Attachment: TestMemLeakage.zip

TestMemLeakage.zip shows that bug.

 CloseableThreadLocal memory leak  in LocalDataStoreSlot (with workaround)
 -

 Key: LUCENENET-358
 URL: https://issues.apache.org/jira/browse/LUCENENET-358
 Project: Lucene.Net
  Issue Type: Bug
 Environment: Microsoft Windows Server 2008 Enterprise x64. SP2.
 .NET Framework 4.0
Reporter: Rezgar Cadro
Assignee: Digy
Priority: Critical
  Labels: memory, CloseableThreadLocal, LocalDataStoreSlot, leak
 Attachments: CloseableThreadLocal MemoryLeak.patch, 
 CloseableThreadLocal.diff, CloseableThreadLocal.diff, 
 CloseableThreadLocal.patch, TestMemLeakage.zip


 Recently we have been suffering from a severe memory leak when executing 
 intense open/close operations on IndexSearcher and IndexModifier. 
 Memory profiling showed that memory is being held by LocalDataStore[] objects.
 After some digging, the root of the problem has been found in 
 CloseableThreadLocal class:
 private System.LocalDataStoreSlot t = 
 System.Threading.Thread.AllocateDataSlot();
 What we see is that every instantiated CloseableThreadLocal object causes 
 a new data slot to be allocated for every thread. 
 Thread.AllocateDataSlot() does not simply allocate a new slot that replaces an 
 old one; it enlarges an existing per-thread buffer, appending data to the 
 end of the internal LocalDataStore[] collection, which causes a severe memory 
 leak.
 Since the t variable is instantiated on every object creation, and (in the 
 current class implementation) every object is used by a single thread, 
 replacing private System.LocalDataStoreSlot t = 
 System.Threading.Thread.AllocateDataSlot(); with a simple private object 
 dataSlot; and removing the hardRefs Dictionary solves the problem and prevents 
 the memory leak. 
 We have tried to implement the expected behavior by using [ThreadStatic] 
 attribute instead of LocalDataStoreSlot, but the attempt failed because of 
 unexpected exceptions being thrown.
 Patch can be found at Lucene.Net repository under 





[Lucene.Net] [jira] [Commented] (LUCENENET-441) Encountered: EOF after : \\\\ during parsing a query

2011-08-25 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13091175#comment-13091175
 ] 

Digy commented on LUCENENET-441:


What does your query look like? 
What is your question?

 Encountered: EOF after :  during parsing a query
 --

 Key: LUCENENET-441
 URL: https://issues.apache.org/jira/browse/LUCENENET-441
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: .Net Framework 4.0
Reporter: Maverick904

 Cannot parse '\': Lexical error at line 1, column 4.  Encountered: EOF 
 after :  |at Lucene.Net.QueryParsers.QueryParser.Parse(String 
 query)
at Lucene.Net.QueryParsers.MultiFieldQueryParser.Parse(Version 
 matchVersion, String query, String[] fields, Occur[] flags, Analyzer analyzer)
at Lucene.Net.QueryParsers.MultiFieldQueryParser.Parse(String query, 
 String[] fields, Occur[] flags, Analyzer analyzer)

  
 Lexical error at line 1, column 4.  Encountered: EOF after :  |
 at Lucene.Net.QueryParsers.QueryParserTokenManager.GetNextToken()
at Lucene.Net.QueryParsers.QueryParser.Jj_ntk()
at Lucene.Net.QueryParsers.QueryParser.Modifiers()
at Lucene.Net.QueryParsers.QueryParser.Query(String field)
at Lucene.Net.QueryParsers.QueryParser.Parse(String query) | 





[Lucene.Net] [jira] [Issue Comment Edited] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067389#comment-13067389
 ] 

Digy edited comment on LUCENENET-437 at 7/18/11 11:28 PM:
--

bq. It ensures equality, but does not ensure inequality.
Sorry but I must object again.
It ensures inequality, but doesn't ensure equality (if the hash codes are not 
equal, the objects are not equal either, but having the same hash code doesn't 
say anything about equality).



  was (Author: digydigy):
bq. It ensures equality, but does not ensure inequality.
Sorry but I must object again.
It ensures inequality, but doesn't ensure equality.


  
 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067409#comment-13067409
 ] 

Digy commented on LUCENENET-437:


Already fixed in 2.9.4g

 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Updated] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-437:
---

Fix Version/s: Lucene.Net 2.9.4g

 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067361#comment-13067361
 ] 

Digy commented on LUCENENET-437:



The Java docs say:
public boolean equals(Object o)
Compares the specified object with this list for equality. Returns true if 
and only if the specified object is also a list, both lists have the same size, 
and *all corresponding pairs of elements in the two lists are equal* [No 
reference for Hashcode - DIGY]. (Two elements e1 and e2 are equal if (e1==null 
? e2==null : e1.equals(e2)).) In other words, two lists are defined to be equal 
if they contain the same elements in the same order. This definition ensures 
that the equals method works properly across different implementations of the 
List interface. 


Yes, the sample was from Eric Lippert's blog to show why GetHashCode should not 
be used for comparing objects.
bq. The issue you're describing is more of a problem with the .NET 
implementation of GetHashcode() rather than the correctness of using hashcode 
for comparison. 
No, the problem is not in the implementation of GetHashCode. In any 
implementation, you may have some unexpected collisions (since it is a 4-byte 
number). GetHashCode isn't meant for uniqueness or object identification. It's 
meant to provide a random distribution.
Therefore the problem really lies in using it for equality comparison.

DIGY
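
The argument can be condensed into a small sketch. Point, EqualByHash and 
EqualByContract are hypothetical names used only for illustration: the type has 
a legal but maximally colliding hash function, so a hash-only comparison 
reports equality for objects that Equals correctly tells apart.

```csharp
using System;

// Hypothetical type with the worst legal hash function: every instance collides.
public class Point
{
    public readonly int X;

    public Point(int x)
    {
        X = x;
    }

    public override bool Equals(object obj)
    {
        Point other = obj as Point;
        return other != null && other.X == X;
    }

    public override int GetHashCode()
    {
        return 1;
    }
}

public static class HashContract
{
    // Wrong: treats a matching hash code as proof of equality.
    public static bool EqualByHash(object a, object b)
    {
        return a.GetHashCode() == b.GetHashCode();
    }

    // Right: a differing hash code rules equality out; otherwise Equals decides.
    public static bool EqualByContract(object a, object b)
    {
        return a.GetHashCode() == b.GetHashCode() && a.Equals(b);
    }
}
```

The hash check in EqualByContract is only a shortcut for the unequal case; the 
final answer always comes from Equals.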




 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067366#comment-13067366
 ] 

Digy commented on LUCENENET-437:


See, even with the worst GetHashCode implementation, the Equals method should still work.

{code}
///
private void Form1_Load(object sender, EventArgs e)
{
    Hashtable h = new Hashtable();

    MyClass m1 = new MyClass() { I = 1 };
    MyClass m2 = new MyClass() { I = 2 };

    h.Add(m1, 1);
    h.Add(m2, 2);

    System.Diagnostics.Debug.Assert(h[m2].Equals(2));
}

public class MyClass
{
    public int I;
    public override int GetHashCode()
    {
        return 1;
    }
}
{code}

 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067375#comment-13067375
 ] 

Digy commented on LUCENENET-437:


bq. This ensures that list1.equals(list2) implies that 
list1.hashCode()==list2.hashCode() for any two lists, list1 and list2, as 
required by the general contract of Object.hashCode.
but it doesn't ensure that 
if list1.hashCode()==list2.hashCode() then list1.equals(list2) should be true, 
as I showed using Eric Lippert's sample.




 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067389#comment-13067389
 ] 

Digy commented on LUCENENET-437:


bq. It ensures equality, but does not ensure inequality.
Sorry but I must object again.
It ensures inequality, but doesn't ensure equality.



 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Updated] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-17 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-437:
---

Affects Version/s: Lucene.Net 2.9.4g
Fix Version/s: Lucene.Net 2.9.4g

 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-17 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13066717#comment-13066717
 ] 

Digy commented on LUCENENET-437:


Using HashCode for equality comparison is not a good idea.

{code}
///
ListComparer<string> comp = new ListComparer<string>();
bool b = comp.Equals(new List<string>() { "\uA0A2\uA0A2" },
    new List<string>() { "" });
System.Diagnostics.Debug.Assert(b == false);
b = comp.Equals(new List<string>() { "\uA0A2\uA0A2" },
    new List<string>() { "\uA0A2\uA0A2\uA0A2\uA0A2" });
System.Diagnostics.Debug.Assert(b == false);

b = new Lucene.Net.Support.EquatableList<string>() { "\uA0A2\uA0A2" }
    .Equals(new Lucene.Net.Support.EquatableList<string>() { "" });
System.Diagnostics.Debug.Assert(b == false);
b = new Lucene.Net.Support.EquatableList<string>() { "\uA0A2\uA0A2" }
    .Equals(new Lucene.Net.Support.EquatableList<string>() { "\uA0A2\uA0A2\uA0A2\uA0A2" });
System.Diagnostics.Debug.Assert(b == false);

///

{code}
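
A comparer that avoids the problem compares sizes and then elements, and uses 
the hash code only for bucketing. The sketch below is a hypothetical 
SequenceListComparer written for illustration; it is not the ListComparer or 
EquatableList code under discussion.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: list equality is decided element by element,
// never by comparing hash codes.
public class SequenceListComparer<T> : IEqualityComparer<IList<T>>
{
    private readonly IEqualityComparer<T> elementComparer = EqualityComparer<T>.Default;

    public bool Equals(IList<T> x, IList<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        if (x.Count != y.Count) return false;
        for (int i = 0; i < x.Count; i++)
        {
            if (!elementComparer.Equals(x[i], y[i])) return false;
        }
        return true;
    }

    // Equal lists produce equal hash codes; unequal lists may still collide.
    public int GetHashCode(IList<T> list)
    {
        if (list == null) return 0;
        int hash = 1;
        unchecked
        {
            foreach (T item in list)
            {
                hash = 31 * hash + (item == null ? 0 : elementComparer.GetHashCode(item));
            }
        }
        return hash;
    }
}
```

With this shape, the assertions in the snippet above hold regardless of which 
strings happen to share a hash code on a given runtime.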


DIGY

 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Reopened] (LUCENENET-437) Port Contrib.Shingle from Java

2011-07-17 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy reopened LUCENENET-437:



 Port Contrib.Shingle from Java
 --

 Key: LUCENENET-437
 URL: https://issues.apache.org/jira/browse/LUCENENET-437
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Port Contrib.Shingle from Java





[Lucene.Net] [jira] [Commented] (LUCENENET-434) Remove AnonymousXXXX classes to increase readablity

2011-07-09 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13062351#comment-13062351
 ] 

Digy commented on LUCENENET-434:


very nice.

 Remove Anonymous classes to increase readablity
 ---

 Key: LUCENENET-434
 URL: https://issues.apache.org/jira/browse/LUCENENET-434
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4g
Reporter: Scott Lombard
Assignee: Scott Lombard
 Fix For: Lucene.Net 2.9.4g

 Attachments: TeeSinkTokenFilter.patch

   Original Estimate: 168h
  Time Spent: 5h
  Remaining Estimate: 163h

 Replace the anonymous classes inherited from JLCA, which make the code 
 impossible to read.
 Follow Digy's template to replace the single abstract method with Func or 
 Action
  
 like in FilterCache<T> from:
 protected abstract object MergeDeletes(IndexReader reader, object value);
 to:
 Func<IndexReader, object, object> MergeDeletes;
  
 Determine a solution for the classes with more than 1 abstract method without 
 diverging much from Java.
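
 A minimal sketch of the template described above, under the assumption that 
 the delegate is supplied through the constructor. DelegateCache and TReader 
 are hypothetical names (TReader stands in for IndexReader so the example is 
 self-contained); the real FilterCache<T> may be shaped differently.

```csharp
using System;

// Hypothetical cache: the single abstract method is replaced by a delegate,
// so callers pass a lambda instead of writing an anonymous subclass.
public class DelegateCache<TReader>
{
    // Replaces: protected abstract object MergeDeletes(IndexReader reader, object value);
    public Func<TReader, object, object> MergeDeletes;

    public DelegateCache(Func<TReader, object, object> mergeDeletes)
    {
        if (mergeDeletes == null) throw new ArgumentNullException("mergeDeletes");
        MergeDeletes = mergeDeletes;
    }

    // Applies the supplied merge logic to a cached value.
    public object Get(TReader reader, object cached)
    {
        return MergeDeletes(reader, cached);
    }
}
```

 A call site then reads 
 new DelegateCache<string>((reader, value) => reader + ":" + value) 
 rather than declaring a one-method subclass.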





[Lucene.Net] [jira] [Resolved] (LUCENENET-432) Concurrency issues in SegmentInfo.Files() (LUCENE-2584)

2011-07-07 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-432.


   Resolution: Fixed
Fix Version/s: Lucene.Net 2.9.4
   Lucene.Net 2.9.2
 Assignee: Digy

Patch committed to trunk & 2.9.4g branch

 Concurrency issues in SegmentInfo.Files() (LUCENE-2584)
 ---

 Key: LUCENENET-432
 URL: https://issues.apache.org/jira/browse/LUCENENET-432
 Project: Lucene.Net
  Issue Type: Bug
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Digy
Assignee: Digy
 Fix For: Lucene.Net 2.9.2, Lucene.Net 2.9.4

 Attachments: SegmentInfo.patch


 A multi-threaded call of files() in SegmentInfo could lead to a 
 ConcurrentModificationException if one thread has not yet finished adding to 
 the ArrayList (files) while another thread has already obtained it as 
 cached.
 https://issues.apache.org/jira/browse/LUCENE-2584
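
 The race described above is commonly avoided by filling a local list and 
 publishing it to the cached field only once it is complete. The sketch below 
 shows that pattern with a hypothetical FileListCache class; the file 
 extensions are illustrative, and this is not the content of SegmentInfo.patch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for SegmentInfo; only the caching pattern is shown.
public class FileListCache
{
    private readonly object sync = new object();
    private readonly string segmentName;
    private List<string> files; // Assigned only after the list is fully built.

    public FileListCache(string segmentName)
    {
        this.segmentName = segmentName;
    }

    public IList<string> Files()
    {
        lock (sync)
        {
            if (files == null)
            {
                // Build into a local list so no caller can observe a partial result.
                List<string> built = new List<string>();
                built.Add(segmentName + ".fnm");
                built.Add(segmentName + ".frq");
                built.Add(segmentName + ".prx");
                files = built;
            }
            return files;
        }
    }
}
```

 Every caller either builds the complete list under the lock or receives the 
 already completed one.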





[Lucene.Net] [jira] [Resolved] (LUCENENET-430) Contrib.ChainedFilter

2011-07-07 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-430.


Resolution: Fixed

Instead of creating a small project, I put it into Contrib.Analyzers.

 Contrib.ChainedFilter
 -

 Key: LUCENENET-430
 URL: https://issues.apache.org/jira/browse/LUCENENET-430
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g

 Attachments: ChainedFilter.cs, ChainedFilterTest.cs


 Port of lucene.Java 3.0.3's ChainedFilter & its test cases.
 See the StackOverflow question: How to combine multiple filters within one 
 search?
 http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net





[Lucene.Net] [jira] [Commented] (LUCENENET-418) LuceneTestCase should not have a static method could throw exceptions.

2011-07-07 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061125#comment-13061125
 ] 

Digy commented on LUCENENET-418:


It works! Thanks.
DIGY

 LuceneTestCase should not have a static method could throw exceptions.  
 

 Key: LUCENENET-418
 URL: https://issues.apache.org/jira/browse/LUCENENET-418
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Test
Affects Versions: Lucene.Net 3.x
 Environment: Linux, OSX, etc 
Reporter: michael herndon
Assignee: michael herndon
  Labels: test
 Fix For: Lucene.Net 2.9.4g

   Original Estimate: 2m
  Remaining Estimate: 2m

 Throwing an exception from a static method in the base class of 90% of the 
 tests makes the issue hard to debug in NUnit.
 The test results came back saying that TestFixtureSetup was causing an issue 
 even though it was the static constructor causing problems, and this then 
 propagates to all the tests that derive from LuceneTestCase. 
 The TEMP_DIR needs to be moved to a static util class as a property or even a 
 mixin method.  It took me hours to debug and figure out the real issue, as 
 the underlying exception never bubbled up.  





[Lucene.Net] [jira] [Created] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)

2011-07-07 Thread Digy (JIRA)
AttributeSource can have an invalid computed state (LUCENE-3042)


 Key: LUCENENET-433
 URL: https://issues.apache.org/jira/browse/LUCENENET-433
 Project: Lucene.Net
  Issue Type: Bug
Reporter: Digy
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


If you work with a TokenStream, consume it, then reuse it and add an attribute to 
it, the computed state is wrong.
Thus, for example, clearAttributes() will not actually clear the attribute added.
So in some situations, addAttribute is not actually clearing the computed state 
when it should.

https://issues.apache.org/jira/browse/LUCENE-3042
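
To make the failure mode concrete, here is a small self-contained sketch of a cached "computed state" that has to be invalidated whenever an attribute is added. `AttributeBag` is a hypothetical toy class, not Lucene's AttributeSource; it only mirrors the pattern described above.

```csharp
using System;
using System.Collections.Generic;

public class AttributeBag
{
    private readonly Dictionary<string, string> attributes =
        new Dictionary<string, string>();

    // Cached snapshot of the attribute keys; null means "recompute".
    private List<string> computedKeys;

    public void AddAttribute(string key, string value)
    {
        attributes[key] = value;
        // Invalidate the cache. Skipping this step reproduces the bug:
        // ClearAttributes() would keep using the stale key list.
        computedKeys = null;
    }

    public void ClearAttributes()
    {
        if (computedKeys == null)
        {
            computedKeys = new List<string>(attributes.Keys);
        }
        foreach (string key in computedKeys)
        {
            attributes[key] = "";
        }
    }

    public string Get(string key)
    {
        return attributes[key];
    }
}

public static class AttributeBagDemo
{
    public static void Main()
    {
        var bag = new AttributeBag();
        bag.AddAttribute("term", "a");
        bag.ClearAttributes();            // computes and caches the key list
        bag.AddAttribute("type", "word"); // added after the state was computed
        bag.ClearAttributes();            // must also clear "type"
        Console.WriteLine(bag.Get("type").Length);
    }
}
```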






[Lucene.Net] [jira] [Commented] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)

2011-07-07 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061214#comment-13061214
 ] 

Digy commented on LUCENENET-433:


Here is the test case
{code}
[Test]
public void Test_LUCENE_3042_LUCENENET_433()
{
    String testString = "t";

    Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
    TokenStream stream = analyzer.ReusableTokenStream("dummy", new System.IO.StringReader(testString));
    stream.Reset();
    while (stream.IncrementToken())
    {
        // consume
    }
    stream.End();
    stream.Close();

    AssertAnalyzesToReuse(analyzer, testString, new String[] { "t" });
}
{code}

 AttributeSource can have an invalid computed state (LUCENE-3042)
 

 Key: LUCENENET-433
 URL: https://issues.apache.org/jira/browse/LUCENENET-433
 Project: Lucene.Net
  Issue Type: Bug
Reporter: Digy
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 If you work with a TokenStream, consume it, then reuse it and add an attribute to 
 it, the computed state is wrong.
 Thus, for example, clearAttributes() will not actually clear the attribute 
 added.
 So in some situations, addAttribute is not actually clearing the computed 
 state when it should.
 https://issues.apache.org/jira/browse/LUCENE-3042





[Lucene.Net] [jira] [Resolved] (LUCENENET-172) This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass

2011-07-07 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-172.


Resolution: Fixed
  Assignee: Digy  (was: Scott Lombard)

Fixed in 2.9.4g. No fix for 2.9.4


 This patch fixes the unexceptional exceptions ecountered in FastCharStream 
 and SupportClass
 ---

 Key: LUCENENET-172
 URL: https://issues.apache.org/jira/browse/LUCENENET-172
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: lucene_2.3.1_exceptions_fix.patch, 
 lucene_2.9.4g_exceptions_fix


 The Java version of Lucene handles end-of-file in FastCharStream by throwing 
 an exception. This behavior has been ported to .NET but the behavior carries 
 an unacceptable cost in the .NET environment. This patch is based on the 
 prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge 
 for the solution. While I understand that this patch is outside of the 
 current project specification in that it deviates from the pure nature of 
 the port, I believe that it is very important to make the patch available to 
 any developer looking to leverage Lucene.Net in their project. Thanks for 
 your consideration.
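
The general idea of avoiding an exception for an expected end of input can be sketched as follows. This is only an illustration under stated assumptions: `EofDemo`, `ReadCharThrowing` and `TryReadChar` are hypothetical names, and the attached patches may well solve the problem differently.

```csharp
using System;
using System.IO;

public static class EofDemo
{
    // Style ported from Java: signal end of input with an exception.
    public static char ReadCharThrowing(TextReader reader)
    {
        int c = reader.Read();
        if (c == -1)
        {
            throw new IOException("read past eof");
        }
        return (char)c;
    }

    // Exception-free alternative: report end of input through the return value.
    public static bool TryReadChar(TextReader reader, out char result)
    {
        int c = reader.Read();
        if (c == -1)
        {
            result = '\0';
            return false;
        }
        result = (char)c;
        return true;
    }

    // Consumes the whole input without ever throwing at the end.
    public static int CountChars(string text)
    {
        int count = 0;
        char ch;
        using (var reader = new StringReader(text))
        {
            while (TryReadChar(reader, out ch))
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountChars("lucene"));
    }
}
```

Since every token stream ends exactly once, the throwing variant pays the exception cost once per parsed input, which is the overhead the issue description objects to.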





[Lucene.Net] [jira] [Commented] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)

2011-07-07 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061304#comment-13061304
 ] 

Digy commented on LUCENENET-433:


Committed to 2.9.4g branch

 AttributeSource can have an invalid computed state (LUCENE-3042)
 

 Key: LUCENENET-433
 URL: https://issues.apache.org/jira/browse/LUCENENET-433
 Project: Lucene.Net
  Issue Type: Bug
Reporter: Digy
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: LUCENENET-433.patch


 If you work with a TokenStream, consume it, then reuse it and add an attribute to 
 it, the computed state is wrong.
 Thus, for example, clearAttributes() will not actually clear the attribute 
 added.
 So in some situations, addAttribute is not actually clearing the computed 
 state when it should.
 https://issues.apache.org/jira/browse/LUCENE-3042





[Lucene.Net] [jira] [Commented] (LUCENENET-172) This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass

2011-07-07 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061595#comment-13061595
 ] 

Digy commented on LUCENENET-172:


Already fixed for 2.9.4g

 This patch fixes the unexceptional exceptions ecountered in FastCharStream 
 and SupportClass
 ---

 Key: LUCENENET-172
 URL: https://issues.apache.org/jira/browse/LUCENENET-172
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Scott Lombard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: lucene_2.3.1_exceptions_fix.patch, 
 lucene_2.9.4g_exceptions_fix


 The Java version of Lucene handles end-of-file in FastCharStream by throwing 
 an exception. This behavior has been ported to .NET but the behavior carries 
 an unacceptable cost in the .NET environment. This patch is based on the 
 prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge 
 for the solution. While I understand that this patch is outside of the 
 current project specification in that it deviates from the pure nature of 
 the port, I believe that it is very important to make the patch available to 
 any developer looking to leverage Lucene.Net in their project. Thanks for 
 your consideration.





[Lucene.Net] [jira] [Updated] (LUCENENET-172) This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass

2011-07-07 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-172:
---

Fix Version/s: Lucene.Net 2.9.4g

 This patch fixes the unexceptional exceptions ecountered in FastCharStream 
 and SupportClass
 ---

 Key: LUCENENET-172
 URL: https://issues.apache.org/jira/browse/LUCENENET-172
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Scott Lombard
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: lucene_2.3.1_exceptions_fix.patch, 
 lucene_2.9.4g_exceptions_fix


 The Java version of Lucene handles end-of-file in FastCharStream by throwing 
 an exception. This behavior has been ported to .NET but the behavior carries 
 an unacceptable cost in the .NET environment. This patch is based on the 
 prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge 
 for the solution. While I understand that this patch is outside of the 
 current project specification in that it deviates from the pure nature of 
 the port, I believe that it is very important to make the patch available to 
 any developer looking to leverage Lucene.Net in their project. Thanks for 
 your consideration.





[Lucene.Net] [jira] [Commented] (LUCENENET-431) Spatial.Net Cartesian won't find docs in radius in certain cases

2011-07-06 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060709#comment-13060709
 ] 

Digy commented on LUCENENET-431:


Thanks Olle and Matt,

I committed the LUCENE-1930 patch to the 2.9.4g branch (+ added Olle's test 
case).

(This is another divergence from lucene.java, since the patch is still waiting to be 
applied there.)

DIGY

 Spatial.Net Cartesian won't find docs in radius in certain cases
 

 Key: LUCENENET-431
 URL: https://issues.apache.org/jira/browse/LUCENENET-431
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.4
 Environment: Windows 7 x64
Reporter: Olle Jacobsen
  Labels: spatialsearch

 To replicate, change Lucene.Net.Contrib.Spatial.Test.TestCartesian to the 
 following, which should return 3 results.
 Line
 42: private double _lat = 55.6880508001;
 43: private double _lng = 13.5871808352; // This passes: 13.6271808352
 73: AddPoint(writer, "Within radius", 55.6880508001, 13.5717346673);
 74: AddPoint(writer, "Within radius", 55.6821978456, 13.6076183965);
 75: AddPoint(writer, "Within radius", 55.673251569, 13.5946697607);
 76: AddPoint(writer, "Close but not in radius", 55.8634157297, 13.5497731987);
 77: AddPoint(writer, "Faar away", 40.7137578228, -74.0126901936);
 130: const double miles = 5.0;
 156: Console.WriteLine("Distances should be 3 " + distances.Count);
 157: Console.WriteLine("Results should be 3 " + results);
 159: Assert.AreEqual(3, distances.Count); // fixed a store of only needed distances
 160: Assert.AreEqual(3, results);





[Lucene.Net] [jira] [Resolved] (LUCENENET-431) Spatial.Net Cartesian won't find docs in radius in certain cases

2011-07-06 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-431.


   Resolution: Fixed
Fix Version/s: Lucene.Net 2.9.4g
 Assignee: Digy

 Spatial.Net Cartesian won't find docs in radius in certain cases
 

 Key: LUCENENET-431
 URL: https://issues.apache.org/jira/browse/LUCENENET-431
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.4
 Environment: Windows 7 x64
Reporter: Olle Jacobsen
Assignee: Digy
  Labels: spatialsearch
 Fix For: Lucene.Net 2.9.4g


 To replicate, change Lucene.Net.Contrib.Spatial.Test.TestCartesian to the 
 following, which should return 3 results.
 Line
 42: private double _lat = 55.6880508001;
 43: private double _lng = 13.5871808352; // This passes: 13.6271808352
 73: AddPoint(writer, "Within radius", 55.6880508001, 13.5717346673);
 74: AddPoint(writer, "Within radius", 55.6821978456, 13.6076183965);
 75: AddPoint(writer, "Within radius", 55.673251569, 13.5946697607);
 76: AddPoint(writer, "Close but not in radius", 55.8634157297, 13.5497731987);
 77: AddPoint(writer, "Faar away", 40.7137578228, -74.0126901936);
 130: const double miles = 5.0;
 156: Console.WriteLine("Distances should be 3 " + distances.Count);
 157: Console.WriteLine("Results should be 3 " + results);
 159: Assert.AreEqual(3, distances.Count); // fixed a store of only needed distances
 160: Assert.AreEqual(3, results);





[Lucene.Net] [jira] [Commented] (LUCENENET-418) LuceneTestCase should not have a static method could throw exceptions.

2011-07-05 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059955#comment-13059955
 ] 

Digy commented on LUCENENET-418:


It fails in both builds.

 LuceneTestCase should not have a static method could throw exceptions.  
 

 Key: LUCENENET-418
 URL: https://issues.apache.org/jira/browse/LUCENENET-418
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Test
Affects Versions: Lucene.Net 3.x
 Environment: Linux, OSX, etc 
Reporter: michael herndon
Assignee: michael herndon
  Labels: test
 Fix For: Lucene.Net 2.9.4g

   Original Estimate: 2m
  Remaining Estimate: 2m

 Throwing an exception from a static method in the base class of 90% of the 
 tests makes the issue hard to debug in NUnit.
 The test results came back saying that TestFixtureSetup was causing an issue 
 even though it was the static constructor causing problems, and this then 
 propagates to all the tests that derive from LuceneTestCase. 
 The TEMP_DIR needs to be moved to a static util class as a property or even a 
 mixin method.  It took me hours to debug and figure out the real issue, as 
 the underlying exception never bubbled up.  





[Lucene.Net] [jira] [Updated] (LUCENENET-430) Contrib.ChainedFilter

2011-07-04 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-430:
---

Attachment: ChainedFilterTest.cs
ChainedFilter.cs

 Contrib.ChainedFilter
 -

 Key: LUCENENET-430
 URL: https://issues.apache.org/jira/browse/LUCENENET-430
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g

 Attachments: ChainedFilter.cs, ChainedFilterTest.cs


 Port of lucene.Java 3.0.3's ChainedFilter & its test cases.
 See the StackOverflow question: How to combine multiple filters within one 
 search?
 http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net





[Lucene.Net] [jira] [Created] (LUCENENET-430) Contrib.ChainedFilter

2011-07-04 Thread Digy (JIRA)
Contrib.ChainedFilter
-

 Key: LUCENENET-430
 URL: https://issues.apache.org/jira/browse/LUCENENET-430
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g
 Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of lucene.Java 3.0.3's ChainedFilter & its test cases.

See the StackOverflow question: How to combine multiple filters within one 
search?
http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net





[Lucene.Net] [jira] [Closed] (LUCENENET-428) How to do that the results are displayed in the first original tokens and them with synonyms?

2011-06-29 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy closed LUCENENET-428.
--

Resolution: Invalid

Please post questions to the mailing list, not in JIRA

 How to do that the results are displayed in the first original tokens and 
 them with synonyms?
 -

 Key: LUCENENET-428
 URL: https://issues.apache.org/jira/browse/LUCENENET-428
 Project: Lucene.Net
  Issue Type: Wish
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4
 Environment: .net 4.0
Reporter: Vladimir

 How can I get the results that match the original tokens displayed first, 
 followed by those that match synonyms?
 My Analyzer (part):
 public override TokenStream TokenStream(string fieldName, TextReader reader)
 {
     TokenStream result = new StandardTokenizer(reader);
     result = new LowerCaseFilter(result);
     result = new StopFilter(result, stoptable);
     result = new SynonymFilter(result, synonymEngine);
     result = new ExtendedRussianStemFilter(result, charset);
     return result;
 }
 My SynonymFilter:
 internal class SynonymFilter : TokenFilter
 {
     private readonly ISynonymEngine engine;
     private readonly Queue<Token> synonymTokenQueue
         = new Queue<Token>();

     public SynonymFilter(TokenStream tokenStream, ISynonymEngine engine)
         : base(tokenStream)
     {
         this.engine = engine;
     }

     public override Token Next()
     {
         // Emit queued synonyms before reading the next input token.
         if (synonymTokenQueue.Count > 0)
         {
             return synonymTokenQueue.Dequeue();
         }

         Token t = input.Next();

         if (t == null)
             return null;
         if (t.Type() == SYNONYM)
             return t;

         IEnumerable<string> synonyms = engine.GetSynonyms(t.TermText());

         if (synonyms == null)
         {
             return t;
         }

         foreach (string syn in synonyms)
         {
             if (!t.TermText().Equals(syn))
             {
                 var synToken = new Token(syn, t.StartOffset(),
                                          t.EndOffset(), SYNONYM);

                 // Position increment 0 puts the synonym at the same position as the original token.
                 synToken.SetPositionIncrement(0);
                 synonymTokenQueue.Enqueue(synToken);
             }
         }
         return t;
     }
 }
 Thanks!





[Lucene.Net] [jira] [Created] (LUCENENET-429) Corrupted segment file not detected and wipes index contents (LUCENE-3255)

2011-06-28 Thread Digy (JIRA)
Corrupted segment file not detected and wipes index contents (LUCENE-3255)
--

 Key: LUCENENET-429
 URL: https://issues.apache.org/jira/browse/LUCENENET-429
 Project: Lucene.Net
  Issue Type: New Feature
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g


https://issues.apache.org/jira/browse/LUCENE-3255





[Lucene.Net] [jira] [Updated] (LUCENENET-429) Corrupted segment file not detected and wipes index contents (LUCENE-3255)

2011-06-28 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-429:
---

Attachment: LUCENENET-429.patch

 Corrupted segment file not detected and wipes index contents (LUCENE-3255)
 --

 Key: LUCENENET-429
 URL: https://issues.apache.org/jira/browse/LUCENENET-429
 Project: Lucene.Net
  Issue Type: New Feature
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g

 Attachments: LUCENENET-429.patch


 https://issues.apache.org/jira/browse/LUCENE-3255





[Lucene.Net] [jira] [Updated] (LUCENENET-427) Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)

2011-06-27 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-427:
---

Attachment: FastVectorHighlighter.patch

 Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)
 ---

 Key: LUCENENET-427
 URL: https://issues.apache.org/jira/browse/LUCENENET-427
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g

 Attachments: FastVectorHighlighter.patch


 https://issues.apache.org/jira/browse/LUCENE-3234





[Lucene.Net] [jira] [Resolved] (LUCENENET-427) Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)

2011-06-27 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-427.


Resolution: Fixed

Committed 

 Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)
 ---

 Key: LUCENENET-427
 URL: https://issues.apache.org/jira/browse/LUCENENET-427
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4g

 Attachments: FastVectorHighlighter.patch


 https://issues.apache.org/jira/browse/LUCENE-3234





[jira] [Commented] (LUCENE-3234) Provide limit on phrase analysis in FastVectorHighlighter

2011-06-25 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054935#comment-13054935
 ] 

Digy commented on LUCENE-3234:
--

I am not sure how much it is related to this issue but there was
a similar issue in Lucene.Net.
https://issues.apache.org/jira/browse/LUCENENET-350



 Provide limit on phrase analysis in FastVectorHighlighter
 -

 Key: LUCENE-3234
 URL: https://issues.apache.org/jira/browse/LUCENE-3234
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 2.9.4, 3.0.3, 3.1, 3.2, 3.3
Reporter: Mike Sokolov
Assignee: Koji Sekiguchi
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3234.patch, LUCENE-3234.patch, LUCENE-3234.patch


 With larger documents, FVH can spend a lot of time trying to find the 
 best-scoring snippet as it examines every possible phrase formed from 
 matching terms in the document.  If one is willing to accept
 less-than-perfect scoring by limiting the number of phrases that are 
 examined, substantial speedups are possible.  This is analogous to the 
 Highlighter limit on the number of characters to analyze.
 The patch includes an artificial test case that shows > 1000x speedup.  In a 
 more normal test environment, with English documents and random queries, I am 
 seeing speedups of around 3-10x when setting phraseLimit=1, which has the 
 effect of selecting the first possible snippet in the document.  Most of our 
 sites operate in this way (just show the first snippet), so this would be a 
 big win for us.
 With phraseLimit = -1, you get the existing FVH behavior. At larger values of 
 phraseLimit, you may not get substantial speedup in the normal case, but you 
 do get the benefit of protection against blow-up in pathological cases.
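
The trade-off described above can be illustrated with a toy model. This is a sketch only: `PhraseLimitDemo` and `BestScore` are hypothetical and are not the FastVectorHighlighter code. Only the meaning of the limit (examine at most that many candidates, with -1 meaning no limit) is taken from the description.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseLimitDemo
{
    // Returns the best score among at most phraseLimit candidates.
    // phraseLimit = -1 examines every candidate (the unlimited behavior).
    public static int BestScore(IEnumerable<int> candidateScores, int phraseLimit)
    {
        int best = int.MinValue;
        int examined = 0;
        foreach (int score in candidateScores)
        {
            if (phraseLimit >= 0 && examined >= phraseLimit)
            {
                break; // accept a possibly less-than-perfect result
            }
            if (score > best)
            {
                best = score;
            }
            examined++;
        }
        return best;
    }

    public static void Main()
    {
        var scores = new List<int> { 3, 9, 5 };
        Console.WriteLine(BestScore(scores, 1));  // only the first candidate is examined
        Console.WriteLine(BestScore(scores, -1)); // all candidates are examined
    }
}
```

With a limit of 1 the first candidate wins regardless of its score, which matches the "just show the first snippet" use case mentioned in the description.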




[Lucene.Net] [jira] [Commented] (LUCENENET-426) Mark BaseFragmentsBuilder methods as virtual

2011-06-23 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053722#comment-13053722
 ] 

Digy commented on LUCENENET-426:


10 min. work done.
DIGY

 Mark BaseFragmentsBuilder methods as virtual
 

 Key: LUCENENET-426
 URL: https://issues.apache.org/jira/browse/LUCENENET-426
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x, 
 Lucene.Net 2.9.4g
Reporter: Itamar Syn-Hershko
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: fvh.patch


 Without marking methods in BaseFragmentsBuilder as virtual, it is meaningless 
 to have FragmentsBuilder deriving from a class named "Base", since most of 
 its functionality cannot be overridden. Attached is a patch for marking the 
 important methods virtual.
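
The underlying language difference is that Java methods are overridable by default, while C# methods must be marked virtual explicitly. A minimal sketch with hypothetical class and method names (not the BaseFragmentsBuilder API):

```csharp
using System;

public class BaseBuilder
{
    // Not virtual: a subclass can only hide this method, not override it.
    public string NonVirtualName()
    {
        return "base";
    }

    // Virtual: a subclass can override this method.
    public virtual string VirtualName()
    {
        return "base";
    }
}

public class CustomBuilder : BaseBuilder
{
    // Hiding takes effect only when called through a CustomBuilder reference.
    public new string NonVirtualName()
    {
        return "custom";
    }

    public override string VirtualName()
    {
        return "custom";
    }
}

public static class VirtualDemo
{
    public static void Main()
    {
        BaseBuilder builder = new CustomBuilder();
        Console.WriteLine(builder.NonVirtualName()); // prints "base"
        Console.WriteLine(builder.VirtualName());    // prints "custom"
    }
}
```

Code that holds the object through the base type, as library code usually does, never sees the hidden method, which is why customization is not possible without the virtual modifier.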





[Lucene.Net] [jira] [Resolved] (LUCENENET-426) Mark BaseFragmentsBuilder methods as virtual

2011-06-16 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-426.


   Resolution: Fixed
Fix Version/s: Lucene.Net 2.9.4g
   Lucene.Net 2.9.4

Thanks Itamar.
Fixed in trunk & 2.9.4g branch.

DIGY

 Mark BaseFragmentsBuilder methods as virtual
 

 Key: LUCENENET-426
 URL: https://issues.apache.org/jira/browse/LUCENENET-426
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x, 
 Lucene.Net 2.9.4g
Reporter: Itamar Syn-Hershko
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: fvh.patch


 Without marking methods in BaseFragmentsBuilder as virtual, it is meaningless 
 to have FragmentsBuilder deriving from a class named "Base", since most of 
 its functionality cannot be overridden. Attached is a patch for marking the 
 important methods virtual.





[Lucene.Net] [jira] [Commented] (LUCENENET-417) implement streams as field values

2011-06-15 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049937#comment-13049937
 ] 

Digy commented on LUCENENET-417:


Maybe something like this

doc.Add(new Field(name,-
doc.Add(new Field(metadata,-
doc.Add(new Field(content,part1-
doc.Add(new Field(content,part2-

doc.Add(new Field(content,partN-


DIGY

 implement streams as field values
 -

 Key: LUCENENET-417
 URL: https://issues.apache.org/jira/browse/LUCENENET-417
 Project: Lucene.Net
  Issue Type: New Feature
  Components: Lucene.Net Core
Reporter: Christopher Currens
 Attachments: StreamValues.patch


 Adding binary values to a field is an expensive operation, as the whole 
 binary data must be loaded into memory and then written to the index.  Adding 
 the ability to use a stream instead of a byte array could not only speed up 
 the indexing process but also reduce the memory footprint.
 -Java lucene has the ability to use a TextReader the both analyze and store 
 text in the index.-  Lucene.NET lacks the ability to store string data in the 
 index via streams. This should be a feature added into Lucene .NET as well.  
 My thoughts are to add another Field constructor, that is Field(string name, 
 System.IO.Stream stream, System.Text.Encoding encoding), that will allow the 
 text to be analyzed and stored into the index.
 Comments about this approach are greatly appreciated.





[Lucene.Net] [jira] [Commented] (LUCENENET-425) MMapDirectory implementation

2011-06-14 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049322#comment-13049322
 ] 

Digy commented on LUCENENET-425:


I would like to compare search speeds of FSDirectory and MMapDirectory (if 
possible, with a big index)

1- IndexReader.Open( FSDirectory.Open(new System.IO.FileInfo(INDEX))  ,true );
2- IndexReader.Open( new MMapDirectory(new System.IO.FileInfo(INDEX)) ,true );

Something like
{code}
public class TestSearchSpeed
{
    string INDEX = @"Path to Index";
    string[] _words = new string[] { }; // Some words to search
    Directory _dir;

    public long TestFSDir()
    {
        _dir = FSDirectory.Open(new System.IO.FileInfo(INDEX));
        return Test();
    }

    public long TestMMapDir()
    {
        _dir = new MMapDirectory(new System.IO.FileInfo(INDEX));
        return Test();
    }

    long Test()
    {
        IndexReader reader = IndexReader.Open(_dir, true);
        Search(reader, "sometext"); // initial search, outside the timed section

        Stopwatch sw = new Stopwatch();
        sw.Start();

        // 5 rounds of 50 parallel searches, cycling through _words
        for (int i = 0; i < 5; i++)
        {
            Parallel.For(0, 50, j =>
            {
                Search(reader, _words[j % _words.Length]);
            });
        }

        sw.Stop();
        long dur = sw.ElapsedMilliseconds;

        reader.Close();

        return dur;
    }

    void Search(IndexReader reader, string criteria)
    {
        IndexSearcher src = new IndexSearcher(reader);
        Query q = new QueryParser("field", new WhitespaceAnalyzer()).Parse(criteria);
        TopDocs hits = src.Search(q, 100);
        // Load each hit so that stored-field access is part of the measurement
        for (int i = 0; i < hits.ScoreDocs.Length; i++)
        {
            Document doc = reader.Document(hits.ScoreDocs[i].doc);
            string s = doc.GetField("field").StringValue();
        }
    }
}
{code}

DIGY


 MMapDirectory implementation
 

 Key: LUCENENET-425
 URL: https://issues.apache.org/jira/browse/LUCENENET-425
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4g

 Attachments: MMapDirectory.patch


 Since this is not a direct port of MMapDirectory.java, I'll put it under 
 Support and implement MMapDirectory as 
 {code}
 public class MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory
 {
 }
 {code}
 If a memory map cannot be created (for example, if the file is too big to fit 
 in the 32-bit address range), it will default to FSDirectory.FSIndexInput.
 In my tests, I didn't see any performance gain in a 32-bit environment, and I 
 consider it better than nothing. 
 I would be happy if someone could send test results on a 64-bit platform.
 DIGY





[Lucene.Net] [jira] [Commented] (LUCENENET-423) QueryParser differences between Java and .NET

2011-06-14 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049377#comment-13049377
 ] 

Digy commented on LUCENENET-423:


I don't think there is an inconsistency between the Java version and .NET.
If you know that the field is indexed as a date, then you should give your 
date string (while searching) in a form the language can parse.
(And UIs in both languages return datetime strings parseable by other 
libraries. It is not common for the user to type the datetime string in a 
textbox.)

DIGY

 QueryParser differences between Java and .NET
 -

 Key: LUCENENET-423
 URL: https://issues.apache.org/jira/browse/LUCENENET-423
 Project: Lucene.Net
  Issue Type: Bug
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Christopher Currens

 When trying to do a RangeQuery that uses dates in a certain format, .NET 
 behaves differently from its Java counterpart.  The code is the same between 
 them, but as far as I can tell, it appears that it is a difference in the way 
 Java parses dates vs how .NET parses dates.  To reproduce:
 {code:java}
 var queryParser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, 
 "FullText", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
 var query = queryParser.Parse("Field:[2001-01-17 TO 2001-01-20]");
 {code}
 You'll notice that query looks like the old DateField format (eg 
 0g1d64542).  If you do the same query in Java (or Luke), you'll notice the 
 query gets parsed as if it were a RangeQuery of string.  AFAIK, Java cannot 
 parse a string formatted in that way.  If you change the string to use / 
 instead of - in the java, you'll get one that uses DateResolutions and 
 DateTools.DateToString().
 It seems an appropriate fix for this, if we wanted to keep this behavior 
 similar to Java, would be to write our own DateTime parser that behaved the 
 same way to Java's parser.
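
A minimal sketch of the difference described above, not code taken from either QueryParser. The helper names `LenientIsDate` and `StrictIsDate` are hypothetical, and the `M/d/yyyy` pattern is only an assumption about what a stricter, slash-only date check might look like. It illustrates that culture-aware .NET parsing accepts `2001-01-17`, while an exact-format check would reject it and leave the range to be handled as plain strings.

```csharp
using System;
using System.Globalization;

public static class DateProbe
{
    // Lenient check: roughly what a general-purpose DateTime parse accepts.
    public static bool LenientIsDate(string text)
    {
        DateTime ignored;
        return DateTime.TryParse(text, CultureInfo.InvariantCulture,
                                 DateTimeStyles.None, out ignored);
    }

    // Strict check (assumed pattern): only slash-separated month/day/year.
    public static bool StrictIsDate(string text)
    {
        DateTime ignored;
        return DateTime.TryParseExact(text, "M/d/yyyy", CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out ignored);
    }
}
```

With a strict check of this kind, `2001-01-17` would not be treated as a date, whereas `1/17/2001` would. Whether that matches the Java behavior for every locale is not something this sketch establishes.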





[Lucene.Net] [jira] [Commented] (LUCENENET-417) implement streams as field values

2011-06-14 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049350#comment-13049350
 ] 

Digy commented on LUCENENET-417:


Maybe this is a stupid question, but what is the reason to index a very large 
doc?
If I indexed a whole book as a single document, it would appear in the result 
sets of almost every kind of search.
search computer --> this book.
search sport --> this book.
search politics --> this book.

DIGY

 implement streams as field values
 -

 Key: LUCENENET-417
 URL: https://issues.apache.org/jira/browse/LUCENENET-417
 Project: Lucene.Net
  Issue Type: New Feature
  Components: Lucene.Net Core
Reporter: Christopher Currens
 Attachments: StreamValues.patch


 Adding binary values to a field is an expensive operation, as the whole 
 binary data must be loaded into memory and then written to the index.  Adding 
 the ability to use a stream instead of a byte array could not only speed up 
 the indexing process, but also reduce the memory footprint.
 -Java lucene has the ability to use a TextReader to both analyze and store 
 text in the index.-  Lucene.NET lacks the ability to store string data in the 
 index via streams. This should be a feature added into Lucene.NET as well.  
 My thoughts are to add another Field constructor, that is Field(string name, 
 System.IO.Stream stream, System.Text.Encoding encoding), that will allow the 
 text to be analyzed and stored into the index.
 Comments about this approach are greatly appreciated.





[Lucene.Net] [jira] [Commented] (LUCENENET-425) MMapDirectory implementation

2011-06-14 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049473#comment-13049473
 ] 

Digy commented on LUCENENET-425:


OK, I think it will be better to mark MMapDirectory as unimplemented like 
NIOFSDirectory.

DIGY

 MMapDirectory implementation
 

 Key: LUCENENET-425
 URL: https://issues.apache.org/jira/browse/LUCENENET-425
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4g

 Attachments: MMapDirectory.patch


 Since this is not a direct port of MMapDirectory.java, I'll put it under 
 Support and implement MMapDirectory as 
 {code}
 public class MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory
 {
 }
 {code}
 If a memory map cannot be created (for example, if the file is too big to fit 
 in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.
 In my tests, I didn't see any performance gain in a 32-bit environment, but I 
 consider it better than nothing.
 I would be happy if someone could send test results from a 64-bit platform.
 DIGY





[Lucene.Net] [jira] [Closed] (LUCENENET-425) MMapDirectory implementation

2011-06-14 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy closed LUCENENET-425.
--

Resolution: Won't Fix

 MMapDirectory implementation
 

 Key: LUCENENET-425
 URL: https://issues.apache.org/jira/browse/LUCENENET-425
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4g

 Attachments: MMapDirectory.patch


 Since this is not a direct port of MMapDirectory.java, I'll put it under 
 Support and implement MMapDirectory as 
 {code}
 public class MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory
 {
 }
 {code}
 If a memory map cannot be created (for example, if the file is too big to fit 
 in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.
 In my tests, I didn't see any performance gain in a 32-bit environment, but I 
 consider it better than nothing.
 I would be happy if someone could send test results from a 64-bit platform.
 DIGY





[Lucene.Net] [jira] [Updated] (LUCENENET-425) MMapDirectory implementation

2011-06-13 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-425:
---

Description: 
Since this is not a direct port of MMapDirectory.java, I'll put it under 
Support and implement MMapDirectory as 
{code}
public class MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory
{
}
{code}
If a memory map cannot be created (for example, if the file is too big to fit 
in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.

In my tests, I didn't see any performance gain in a 32-bit environment, but I 
consider it better than nothing.

I would be happy if someone could send test results from a 64-bit platform.

DIGY

  was:
Since this is not a direct port of MMapDirectory.java, I'll put it under 
Support and implement MMapDirectory as 
{code}
public class 
MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory:Lucene.Net.Support.MemoryMappedDirectory
{
}
{code}
If a memory map cannot be created (for example, if the file is too big to fit 
in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.

In my tests, I didn't see any performance gain in a 32-bit environment, but I 
consider it better than nothing.

I would be happy if someone could send test results from a 64-bit platform.

DIGY


 MMapDirectory implementation
 

 Key: LUCENENET-425
 URL: https://issues.apache.org/jira/browse/LUCENENET-425
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4g

 Attachments: MMapDirectory.patch


 Since this is not a direct port of MMapDirectory.java, I'll put it under 
 Support and implement MMapDirectory as 
 {code}
 public class MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory
 {
 }
 {code}
 If a memory map cannot be created (for example, if the file is too big to fit 
 in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.
 In my tests, I didn't see any performance gain in a 32-bit environment, but I 
 consider it better than nothing.
 I would be happy if someone could send test results from a 64-bit platform.
 DIGY





[Lucene.Net] [jira] [Updated] (LUCENENET-425) MMapDirectory implementation

2011-06-13 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-425:
---

Attachment: MMapDirectory.patch

 MMapDirectory implementation
 

 Key: LUCENENET-425
 URL: https://issues.apache.org/jira/browse/LUCENENET-425
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4g

 Attachments: MMapDirectory.patch


 Since this is not a direct port of MMapDirectory.java, I'll put it under 
 Support and implement MMapDirectory as 
 {code}
 public class 
 MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory:Lucene.Net.Support.MemoryMappedDirectory
 {
 }
 {code}
 If a memory map cannot be created (for example, if the file is too big to fit 
 in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.
 In my tests, I didn't see any performance gain in a 32-bit environment, but I 
 consider it better than nothing.
 I would be happy if someone could send test results from a 64-bit platform.
 DIGY





[Lucene.Net] [jira] [Created] (LUCENENET-425) MMapDirectory implementation

2011-06-13 Thread Digy (JIRA)
MMapDirectory implementation


 Key: LUCENENET-425
 URL: https://issues.apache.org/jira/browse/LUCENENET-425
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Trivial
 Fix For: Lucene.Net 2.9.4g
 Attachments: MMapDirectory.patch

Since this is not a direct port of MMapDirectory.java, I'll put it under 
Support and implement MMapDirectory as 
{code}
public class 
MMapDirectory:Lucene.Net.Support.MemoryMappedDirectory:Lucene.Net.Support.MemoryMappedDirectory
{
}
{code}
If a memory map cannot be created (for example, if the file is too big to fit 
in a 32-bit address range), it will fall back to FSDirectory.FSIndexInput.

In my tests, I didn't see any performance gain in a 32-bit environment, but I 
consider it better than nothing.

I would be happy if someone could send test results from a 64-bit platform.

DIGY
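
The fallback idea in this description can be sketched roughly as follows. This is not the attached MMapDirectory.patch; `MappedReadSketch` and `ReadAll` are hypothetical names, and the set of exceptions treated as "mapping not possible" is an assumption. The sketch uses System.IO.MemoryMappedFiles (.NET 4.0 and later) and reverts to ordinary file I/O when the map cannot be created.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class MappedReadSketch
{
    // Reads a whole (small) file through a memory-mapped view when possible;
    // otherwise falls back to plain stream-based reading.
    public static byte[] ReadAll(string path, out bool usedMap)
    {
        long length = new FileInfo(path).Length;
        try
        {
            using (MemoryMappedFile map = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
            using (MemoryMappedViewAccessor view =
                       map.CreateViewAccessor(0, length, MemoryMappedFileAccess.Read))
            {
                byte[] data = new byte[length];
                view.ReadArray(0, data, 0, data.Length);
                usedMap = true;
                return data;
            }
        }
        catch (Exception e) when (e is IOException || e is ArgumentException
                                  || e is UnauthorizedAccessException)
        {
            // Assumed failure cases: empty file, no write permission, or the
            // view does not fit in the available address space.
            usedMap = false;
            return File.ReadAllBytes(path);
        }
    }
}
```

The caller gets the same bytes either way; `usedMap` only reports which path was taken, which could be useful when comparing timings on 32-bit and 64-bit platforms.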





[Lucene.Net] [jira] [Updated] (LUCENENET-424) IsolatedStorage Support for Windows Phone 7

2011-06-10 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-424:
---

Attachment: TestIsolatedStorageDirectory.cs

Test cases for IsolatedStorageDirectory.

(Doesn't IsolatedStorageDirectory have a public constructor?)

 IsolatedStorage Support for Windows Phone 7
 ---

 Key: LUCENENET-424
 URL: https://issues.apache.org/jira/browse/LUCENENET-424
 Project: Lucene.Net
  Issue Type: Task
  Components: Lucene.Net Contrib, Lucene.Net Test
Reporter: Prescott Nasser
Assignee: Prescott Nasser
Priority: Minor
  Labels: wp7
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: TestIsolatedStorageDirectory.cs


 Create IsolatedStorage Store to support windows phone 7 



[Lucene.Net] [jira] [Commented] (LUCENENET-423) QueryParser differences between Java and .NET

2011-06-10 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047089#comment-13047089
 ] 

Digy commented on LUCENENET-423:


Maybe I am missing something,
but I ran your code in both .NET & Java (not Luke) and printed query.ToString().
Same result (in base36).

DIGY

 QueryParser differences between Java and .NET
 -

 Key: LUCENENET-423
 URL: https://issues.apache.org/jira/browse/LUCENENET-423
 Project: Lucene.Net
  Issue Type: Bug
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Christopher Currens

 When trying to do a RangeQuery that uses dates in a certain format, .NET 
 behaves differently from its Java counterpart.  The code is the same between 
 them, but as far as I can tell, it appears that it is a difference in the way 
 Java parses dates vs how .NET parses dates.  To reproduce:
 {code:java}
 var queryParser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, 
 "FullText", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
 var query = queryParser.Parse("Field:[2001-01-17 TO 2001-01-20]");
 {code}
 You'll notice that query looks like the old DateField format (eg 
 0g1d64542).  If you do the same query in Java (or Luke), you'll notice the 
 query gets parsed as if it were a RangeQuery of string.  AFAIK, Java cannot 
 parse a string formatted in that way.  If you change the string to use / 
 instead of - in the java, you'll get one that uses DateResolutions and 
 DateTools.DateToString().
 It seems an appropriate fix for this, if we wanted to keep this behavior 
 similar to Java, would be to write our own DateTime parser that behaved the 
 same way to Java's parser.



[Lucene.Net] [jira] [Commented] (LUCENENET-423) QueryParser differences between Java and .NET

2011-06-10 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047179#comment-13047179
 ] 

Digy commented on LUCENENET-423:


You are right, I used a different date string.

.NET seems to parse date strings better.
I would leave it as is.

DIGY

 QueryParser differences between Java and .NET
 -

 Key: LUCENENET-423
 URL: https://issues.apache.org/jira/browse/LUCENENET-423
 Project: Lucene.Net
  Issue Type: Bug
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Christopher Currens

 When trying to do a RangeQuery that uses dates in a certain format, .NET 
 behaves differently from its Java counterpart.  The code is the same between 
 them, but as far as I can tell, it appears that it is a difference in the way 
 Java parses dates vs how .NET parses dates.  To reproduce:
 {code:java}
 var queryParser = new QueryParser(Lucene.Net.Util.Version.LUCENE_29, 
 "FullText", new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29));
 var query = queryParser.Parse("Field:[2001-01-17 TO 2001-01-20]");
 {code}
 You'll notice that query looks like the old DateField format (eg 
 0g1d64542).  If you do the same query in Java (or Luke), you'll notice the 
 query gets parsed as if it were a RangeQuery of string.  AFAIK, Java cannot 
 parse a string formatted in that way.  If you change the string to use / 
 instead of - in the java, you'll get one that uses DateResolutions and 
 DateTools.DateToString().
 It seems an appropriate fix for this, if we wanted to keep this behavior 
 similar to Java, would be to write our own DateTime parser that behaved the 
 same way to Java's parser.



[Lucene.Net] [jira] [Closed] (LUCENENET-421) Segment files ocasionaly disappearing making index corrupted

2011-06-09 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy closed LUCENENET-421.
--

Resolution: Invalid

It seems the reporter is no longer interested in this issue.
DIGY

 Segment files ocasionaly disappearing making index corrupted
 

 Key: LUCENENET-421
 URL: https://issues.apache.org/jira/browse/LUCENENET-421
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Media Chase ECF50 in the MastermindToys.com online toy 
 store, IIS 7 under Win 2008 R2, index on RAID 1
Reporter: Fedor Taiakin

 IIS 7 under Win 2008 R2, index located on RAID 1
 The only operations are Add Document and Delete Document, optimize = false.
 Occasionally the segment files disappear, corrupting the index. No other 
 exceptions prior to inability to open index:
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'.
  --- System.IO.FileNotFoundException: Could not find file 
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'.
 File name: 
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run()
at Lucene.Net.Index.IndexReader.Open(Directory directory)



[Lucene.Net] [jira] [Commented] (LUCENENET-422) Custom tokenizers may fail when indexing due to ReusableStringReader not implement some method of TextReader

2011-06-05 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044524#comment-13044524
 ] 

Digy commented on LUCENENET-422:


percyboy,

Thanks for your bug fix.
I committed the fix to trunk (2.9.4) & to the 2.9.4g branch.

PS: None of TextReader's methods like ReadBlock, ReadLine, Peek, ReadToEnd 
were implemented in ReusableStringReader.
Calling these methods returned only empty strings without giving any info 
to the users.
Therefore I added these NotImplementedExceptions in LUCENENET-150 and 
implemented just ReadToEnd
(the only method I used in my custom analyzer).




DIGY



 Custom tokenizers may fail when indexing due to ReusableStringReader not 
 implement some method of TextReader
 

 Key: LUCENENET-422
 URL: https://issues.apache.org/jira/browse/LUCENENET-422
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: from Lucene 2.3.x to current trunk
Reporter: percyboy
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: ReusableStringReader.cs


 Lucene.Net.Index.ReusableStringReader is inherited from TextReader, but marks 
 some methods as Not Implemented.
 Some custom tokenizers who call these unfinished methods will meet an error.
 It is, somewhat, like a trap.
 LUCENENET-150 is a similar issue to this one.
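
For illustration only, here is a minimal sketch of a reusable string-backed TextReader that avoids the trap described above. It is not the attached ReusableStringReader.cs; the class name `SimpleStringReader` and its `Init` method are hypothetical. It overrides Peek, Read and ReadToEnd directly and relies on the TextReader base class to build ReadLine and ReadBlock on top of them. Argument validation is omitted for brevity.

```csharp
using System;
using System.IO;

public sealed class SimpleStringReader : TextReader
{
    private string _s;
    private int _pos;

    public SimpleStringReader(string s) { Init(s); }

    // Allows the same instance to be reused for a new value.
    public void Init(string s) { _s = s ?? string.Empty; _pos = 0; }

    public override int Peek() { return _pos < _s.Length ? _s[_pos] : -1; }

    public override int Read() { return _pos < _s.Length ? _s[_pos++] : -1; }

    public override int Read(char[] buffer, int index, int count)
    {
        int n = Math.Min(count, _s.Length - _pos);
        if (n <= 0) return 0;
        _s.CopyTo(_pos, buffer, index, n);
        _pos += n;
        return n;
    }

    public override string ReadToEnd()
    {
        string rest = _s.Substring(_pos);
        _pos = _s.Length;
        return rest;
    }
}
```

A custom tokenizer that calls ReadLine or Peek on such a reader gets real data instead of an exception or an empty string.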



[Lucene.Net] [jira] [Resolved] (LUCENENET-422) Custom tokenizers may fail when indexing due to ReusableStringReader not implement some method of TextReader

2011-06-05 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-422.


Resolution: Fixed

 Custom tokenizers may fail when indexing due to ReusableStringReader not 
 implement some method of TextReader
 

 Key: LUCENENET-422
 URL: https://issues.apache.org/jira/browse/LUCENENET-422
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: from Lucene 2.3.x to current trunk
Reporter: percyboy
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: ReusableStringReader.cs


 Lucene.Net.Index.ReusableStringReader is inherited from TextReader, but marks 
 some methods as Not Implemented.
 Some custom tokenizers who call these unfinished methods will meet an error.
 It is, somewhat, like a trap.
 LUCENENET-150 is a similar issue to this one.



[Lucene.Net] [jira] [Updated] (LUCENENET-414) The definition of CharArraySet is dangerously confusing and leads to bugs when used.

2011-06-04 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-414:
---

Fix Version/s: (was: Lucene.Net 2.9.2)
   Lucene.Net 2.9.4g
   Lucene.Net 2.9.4

 The definition of CharArraySet is dangerously confusing and leads to bugs 
 when used.
 

 Key: LUCENENET-414
 URL: https://issues.apache.org/jira/browse/LUCENENET-414
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Irrelevant
Reporter: Vincent Van Den Berghe
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g


 Right now, CharArraySet derives from System.Collections.Hashtable, but 
 doesn't actually use this base type for storing elements.
 However, the StandardAnalyzer.STOP_WORDS_SET is exposed as a 
 System.Collections.Hashtable. The trivial code to build your own stopword set 
 using the StandardAnalyzer.STOP_WORDS_SET and adding your own set of 
 stopwords like this:
 CharArraySet myStopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET, ignoreCase: false);
 foreach (string domainSpecificStopWord in DomainSpecificStopWords)
     myStopWords.Add(domainSpecificStopWord);
 ... will fail because the CharArraySet constructor accepts an ICollection, 
 which will be passed the Hashtable instance of STOP_WORDS_SET: the resulting 
 myStopWords will only contain the DomainSpecificStopWords, and not those from 
 STOP_WORDS_SET.
 One workaround would be to replace the first line with this:
 CharArraySet stopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET.Count + DomainSpecificStopWords.Length, ignoreCase: false);
 foreach (string domainSpecificStopWord in (CharArraySet)StandardAnalyzer.STOP_WORDS_SET)
     stopWords.Add(domainSpecificStopWord);
 ... but this makes use of an implementation detail (the STOP_WORDS_SET is 
 really an UnmodifiableCharArraySet, which is itself a CharArraySet). It works 
 because it forces the foreach() to use the correct 
 CharArraySet.GetEnumerator(), which is defined as a new method (this has a 
 bad code smell to it).
 At least 2 possibilities exist to solve this problem:
 - Make CharArraySet use the Hashtable instance and a custom comparator, 
 instead of its own implementation.
 - Make CharArraySet use HashSet<char[]>, defined in .NET 4.0.
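
The trap can be reproduced in isolation. The sketch below is not CharArraySet itself; `ShadowSet` is a hypothetical stand-in that derives from Hashtable, keeps its elements in its own list, and hides GetEnumerator() with a `new` method, which is the pattern described above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in: derives from Hashtable but never stores anything in it.
public class ShadowSet : Hashtable
{
    private readonly List<string> _items = new List<string>();

    public void AddItem(string s) { _items.Add(s); }

    // Hides (does not override) Hashtable.GetEnumerator().
    public new IEnumerator GetEnumerator() { return _items.GetEnumerator(); }
}

public static class ShadowSetDemo
{
    // Enumerates through the interface, as a constructor taking ICollection would.
    public static int CountViaCollection(ICollection c)
    {
        int n = 0;
        foreach (object item in c) n++;
        return n;
    }

    // Enumerates through the concrete type, so the hiding method is used.
    public static int CountViaConcreteType(ShadowSet s)
    {
        int n = 0;
        foreach (object item in s) n++;
        return n;
    }
}
```

After two calls to AddItem, the interface-based count sees the empty base Hashtable, while the count through the concrete type sees both items. That is why the cast in the workaround changes the result.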



[Lucene.Net] [jira] [Updated] (LUCENENET-422) Custom tokenizers may fail when indexing due to ReusableStringReader not implement some method of TextReader

2011-06-04 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-422:
---

Fix Version/s: Lucene.Net 2.9.4g
   Lucene.Net 2.9.4

 Custom tokenizers may fail when indexing due to ReusableStringReader not 
 implement some method of TextReader
 

 Key: LUCENENET-422
 URL: https://issues.apache.org/jira/browse/LUCENENET-422
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: from Lucene 2.3.x to current trunk
Reporter: percyboy
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: ReusableStringReader.cs


 Lucene.Net.Index.ReusableStringReader is inherited from TextReader, but marks 
 some methods as Not Implemented.
 Some custom tokenizers who call these unfinished methods will meet an error.
 It is, somewhat, like a trap.
 LUCENENET-150 is a similar issue to this one.



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-06-02 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042963#comment-13042963
 ] 

Digy commented on LUCENENET-415:


According to my last tests, SFS searches cost only an additional 60-80 ms 
compared to standard searches(~3GB index, 1M docs, 342 facets).
(Assuming that the same # of documents are read from the index).

Some other features like 
 - Faceting by query (can SFS be named as Faceting by field?)
 - Range faceting (e.g., monthly facets on fields like 20110602) (again 
correct terminology?)
 - Disk cache for large # of BitSets
etc. can be added in the future.
I think this is enough for *Simple*FacetedSearch.

I will commit it to trunk.

DIGY



 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Resolved] (LUCENENET-415) Contrib/Faceted Search

2011-06-02 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-415.


Resolution: Fixed
  Assignee: Digy

Committed to trunk & 2.9.4g branch

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Assignee: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-06-02 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043011#comment-13043011
 ] 

Digy commented on LUCENENET-415:


Thanks M.Herndon for this wiki page

https://cwiki.apache.org/confluence/display/LUCENENET/Simple+Faceted+Search

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Assignee: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-421) Segment files ocasionaly disappearing making index corrupted

2011-05-31 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041604#comment-13041604
 ] 

Digy commented on LUCENENET-421:


This may happen when two processes/threads access the same index 
*simultaneously* for writing.
IndexWriter doesn't allow this by default, but the lock can be bypassed with 
IndexWriter.Unlock.

Also, might there be other processes accessing the index, such as virus 
scanners?

DIGY

 Segment files ocasionaly disappearing making index corrupted
 

 Key: LUCENENET-421
 URL: https://issues.apache.org/jira/browse/LUCENENET-421
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Media Chase ECF50 in the MastermindToys.com online toy 
 store, IIS 7 under Win 2008 R2, index on RAID 1
Reporter: Fedor Taiakin

 IIS 7 under Win 2008 R2, index located on RAID 1
 The only operations are Add Document and Delete Document, optimize = false.
 Occasionally the segment files disappear, corrupting the index. No other 
 exceptions prior to inability to open index:
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'.
  --- System.IO.FileNotFoundException: Could not find file 
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'.
 File name: 
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run()
at Lucene.Net.Index.IndexReader.Open(Directory directory)



[Lucene.Net] [jira] [Closed] (LUCENENET-416) IndexWriter.Init may orphan its write lock in case of exception

2011-05-31 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy closed LUCENENET-416.
--

   Resolution: Not A Problem
Fix Version/s: Lucene.Net 2.9.4

Fixed in 2.9.4 & 2.9.4g
DIGY

 IndexWriter.Init may orphan its write lock in case of exception
 ---

 Key: LUCENENET-416
 URL: https://issues.apache.org/jira/browse/LUCENENET-416
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: .NET 4
Reporter: Håkan Lindqvist
 Fix For: Lucene.Net 2.9.4


 In IndexWriter.Init, if an exception other than IOException is thrown after 
 the write lock has been acquired, the lock is not released. (See 
 Index\IndexWriter.cs:1922 for a starting point.)
 Specifically, the exception we have seen occurring is 
 UnauthorizedAccessException, e.g. "Access to the path 'C:\foo\bar\segments.gen' 
 is denied."
 Stack trace from the UnauthorizedAccessException as mentioned above:
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess 
 access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, 
 FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean 
 bFromProxy, Boolean useLongPath)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess 
 access, FileShare share)
at 
 Lucene.Net.Store.SimpleFSDirectory.SimpleFSIndexInput.Descriptor..ctor(FileInfo
  file, FileAccess mode)
at Lucene.Net.Store.SimpleFSDirectory.OpenInput(String name, Int32 
 bufferSize)
at Lucene.Net.Store.FSDirectory.OpenInput(String name)
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Lucene.Net.Index.SegmentInfos.Read(Directory directory)
at Lucene.Net.Index.IndexWriter.Init(Directory d, Analyzer a, Boolean 
 create, Boolean closeDir, IndexDeletionPolicy deletionPolicy, Boolean 
 autoCommit, Int32 maxFieldLength, IndexingChain indexingChain, IndexCommit 
 commit)
at Lucene.Net.Index.IndexWriter..ctor(Directory d, Analyzer a, Boolean 
 create, MaxFieldLength mfl)
 I do not know under what circumstances that initial exception occurred but 
 after this has happened all subsequent attempts at accessing the index will 
 fail.
 It seems that changing the catch statement to release the write lock 
 regardless of exception type should solve this.
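
The suggestion above can be pictured with a minimal sketch. This is not the actual IndexWriter code: FakeLock and InitWriter are hypothetical names used only to show the pattern, where the lock is released for any exception type and not just for IOException.

```csharp
using System;

// Hypothetical stand-in for a directory write lock; not a Lucene.Net type.
public sealed class FakeLock
{
    public bool IsHeld { get; private set; }
    public void Obtain() { IsHeld = true; }
    public void Release() { IsHeld = false; }
}

public static class WriterInitSketch
{
    // Runs the initialisation step while holding the lock. If the step throws
    // anything at all (IOException, UnauthorizedAccessException, ...), the lock
    // is released before the exception propagates to the caller.
    public static void InitWriter(FakeLock writeLock, Action initStep)
    {
        writeLock.Obtain();
        bool success = false;
        try
        {
            initStep();
            success = true;
        }
        finally
        {
            // On success the writer keeps the lock; on any failure it gives it back.
            if (!success)
            {
                writeLock.Release();
            }
        }
    }
}
```

Using a success flag with finally, instead of a catch for one exception type, means a later attempt to open the index is not blocked by a stale lock, whatever went wrong during initialisation.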



[Lucene.Net] [jira] [Commented] (LUCENENET-421) Segment files ocasionaly disappearing making index corrupted

2011-05-31 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13041634#comment-13041634
 ] 

Digy commented on LUCENENET-421:


Could you try 2.9.4 in
https://svn.apache.org/repos/asf/incubator/lucene.net/trunk
or 2.9.4g in
https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g
 ?

Maybe, it is a bug like LUCENENET-416 fixed in these versions.

DIGY

 Segment files ocasionaly disappearing making index corrupted
 

 Key: LUCENENET-421
 URL: https://issues.apache.org/jira/browse/LUCENENET-421
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Media Chase ECF50 in the MastermindToys.com online toy 
 store, IIS 7 under Win 2008 R2, index on RAID 1
Reporter: Fedor Taiakin

 IIS 7 under Win 2008 R2, index located on RAID 1
 The only operations are Add Document and Delete Document, optimize = false.
 Occasionally the segment files disappear, corrupting the index. No other 
 exceptions prior to inability to open index:
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'.
  --- System.IO.FileNotFoundException: Could not find file 
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'.
 File name: 
 'C:\Projects\MMT\ECF50\main\src\PublicLayer\SearchIndex\eCommerceFramework\CatalogEntryIndexer\_b6k.cfs'
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run()
at Lucene.Net.Index.IndexReader.Open(Directory directory)



[Lucene.Net] [jira] [Commented] (LUCENENET-420) String.StartsWith has culture in it.

2011-05-28 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13040701#comment-13040701
 ] 

Digy commented on LUCENENET-420:


And are you using a WildcardQuery like "sometext*"?
DIGY

 String.StartsWith has culture in it.
 

 Key: LUCENENET-420
 URL: https://issues.apache.org/jira/browse/LUCENENET-420
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x
 Environment: .NET under (at least) da-DK culture
Reporter: Niels Kühnel
 Fix For: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x

   Original Estimate: 4h
  Remaining Estimate: 4h

 I've been hunting a weird bug for a long time. I finally found its cause.
 I'm Danish, thus my .NET culture is da-DK. In this culture "Gaard" doesn't 
 start with "Ga" because it thinks that "aa" is "å" (as it was in Danish 
 before 1948).
 That gives some unexpected results when doing prefix queries.
 The solution is to add StringComparison.InvariantCulture in all StartsWith 
 comparisons.
 To verify my claim, try running:
 Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("da-DK");
 Assert.IsFalse("Gaard".StartsWith("Ga"));
 Assert.IsTrue("Gaard".StartsWith("Ga", StringComparison.InvariantCulture));
 Cheers,
 Niels Kühnel
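
As a side note to the report above, here is a small sketch of a prefix check that does not consult the current culture at all. It uses StringComparison.Ordinal, which is a different option from the StringComparison.InvariantCulture proposed in the report; PrefixCheckSketch and StartsWithOrdinal are hypothetical names, not part of Lucene.Net.

```csharp
using System;

public static class PrefixCheckSketch
{
    // Ordinal comparison looks only at the UTF-16 code units of both strings,
    // so the result is the same whatever Thread.CurrentThread.CurrentCulture is.
    public static bool StartsWithOrdinal(string value, string prefix)
    {
        return value.StartsWith(prefix, StringComparison.Ordinal);
    }
}
```

Whether ordinal or invariant-culture semantics suit term prefixes better is a design choice for the project; the sketch only shows that the culture dependence comes from the overload without a StringComparison argument.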



[Lucene.Net] [jira] [Commented] (LUCENENET-420) String.StartsWith has culture in it.

2011-05-28 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13040712#comment-13040712
 ] 

Digy commented on LUCENENET-420:


 because it thinks that aa is å
In my test case, "Gaard".StartsWith("Gå") also returns false.

I am still not sure whether it is a Lucene.Net bug or something that should 
be handled by the user.
I'll think about it.

DIGY

 String.StartsWith has culture in it.
 

 Key: LUCENENET-420
 URL: https://issues.apache.org/jira/browse/LUCENENET-420
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x
 Environment: .NET under (at least) da-DK culture
Reporter: Niels Kühnel
 Fix For: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x

   Original Estimate: 4h
  Remaining Estimate: 4h

 I've been hunting a weird bug for a long time. I finally found its cause.
 I'm Danish, thus my .NET culture is da-DK. In this culture "Gaard" doesn't 
 start with "Ga" because it thinks that "aa" is "å" (as it was in Danish 
 before 1948).
 That gives some unexpected results when doing prefix queries.
 The solution is to add StringComparison.InvariantCulture in all StartsWith 
 comparisons.
 To verify my claim, try running:
 Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("da-DK");
 Assert.IsFalse("Gaard".StartsWith("Ga"));
 Assert.IsTrue("Gaard".StartsWith("Ga", StringComparison.InvariantCulture));
 Cheers,
 Niels Kühnel



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-25 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch2.cs
SimpleFacetedSearch2.cs

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-25 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13039259#comment-13039259
 ] 

Digy commented on LUCENENET-415:


With the increasing number of attached files, it is getting hard to trace the 
changes.
I created a contrib project (SimpleFacetedSearch) under the 2.9.4g branch:

https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g

DIGY



 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-25 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13039272#comment-13039272
 ] 

Digy commented on LUCENENET-415:


Hi Ben,
Do you think we still need IndexSearcher & UseCache?

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-25 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13039289#comment-13039289
 ] 

Digy commented on LUCENENET-415:


I'll wait a few days before closing this issue & committing to 2.9.4.
Here are the sources:
Source: 
https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/contrib/SimpleFacetedSearch/
Readme: 
https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/contrib/SimpleFacetedSearch/README.txt
Test & Usage: 
https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/test/contrib/SimpleFacetedSearch

Any comments on class/variable names, APIs etc., since I've never been good at 
them?

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-24 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch2.cs
SimpleFacetedSearch2.cs

I took it one step further: multi-field faceting.
It needs a lot of code cleanup, but it works.

  SimpleFacetedSearch2

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-24 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch2.cs
SimpleFacetedSearch2.cs

Some comments.

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
 SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
 TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
 facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-23 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038063#comment-13038063
 ] 

Digy commented on LUCENENET-415:


Hi Ben,
Thanks for your comments & test code.

{code}
sfs = new SimpleFacetedSearch(reader, "category");
sfs.Search(query); // + fetch
{code}
is roughly equal to
{code}
foreach (var cat in GetGroups("category"))
{
    BooleanQuery bq = new BooleanQuery();

    bq.Add(query, Lucene.Net.Search.BooleanClause.Occur.MUST);
    bq.Add(queryParser.Parse("category:" + cat),
           Lucene.Net.Search.BooleanClause.Occur.MUST);

    indexSearcher.Search(bq); // + fetch
}
{code}

It would be good to compare these two approaches too.

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-23 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038228#comment-13038228
 ] 

Digy commented on LUCENENET-415:


But BitSet+Caching is still faster than BooleanQuery, if I don't misinterpret 
your numbers.
DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, PerformanceTest.cs, 
 PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, facet performance.xls, facet performance.xls


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-22 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch.cs
SimpleFacetedSearch.cs

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: SimpleFacetedSearch.cs
TestSimpleFacetedSearch.cs

Hi Ben, 
There is a maxItemPerGroup parameter in the constructor, but it would be 
better to move it to the Search method.

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch.cs
SimpleFacetedSearch.cs

Some performance improvements.

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: TestSimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: SimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: SimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: TestSimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: TestSimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: TestSimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: (was: SimpleFacetedSearch.cs)

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-21 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13037478#comment-13037478
 ] 

Digy commented on LUCENENET-415:


Hi Ben,
About performance test:

- One of the costly ops in this faceted-search is the creation of 
SimpleFacetedSearch. It creates the bit sets for all of the group members. 
Since it should be created only once when a new IndexReader is opened (if some 
documents are added or deleted), its creation time should be excluded from the 
test.
- Another costly op is fetching data from the index. After each search, some 
data should be read, and this duration should be included in the test.
E.g. 
{code}
// Fetch after a TopDocs search
TopDocs hits = sfs.Search(q, 100);
for (int j = 0; j < hits.ScoreDocs.Length; j++)
{
    Document doc = reader.Document(hits.ScoreDocs[j].doc);
    Fieldable f = doc.GetField("title");
}

// Fetch after a faceted search
SimpleFacetedSearch.Hits hits = sfs.Search(q, maxDocPerGroup);
foreach (var h in hits.HitsPerGroup)
{
    foreach (Document doc in h.Documents)
    {
        Fieldable f = doc.GetField("title");
    }
}
{code}
- Hits is a deprecated class and it repeats the search every N (AFAIK 100) 
document accesses. It is not a normal search and should be excluded from the 
test.
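
To picture the first point (bit sets built once per reader, then reused for every query), here is a rough sketch. It is not the SimpleFacetedSearch source: FacetCounterSketch is a hypothetical class, it uses System.Collections.BitArray as a stand-in bit set, and it takes the group value of each document as a plain array instead of reading it from an index.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class FacetCounterSketch
{
    // One bit set per group value; bit i is set when document i belongs to the group.
    private readonly Dictionary<string, BitArray> groupBits = new Dictionary<string, BitArray>();

    // The costly part, done once per index snapshot: docGroups[i] is the
    // group value of document i.
    public FacetCounterSketch(string[] docGroups)
    {
        for (int doc = 0; doc < docGroups.Length; doc++)
        {
            BitArray bits;
            if (!groupBits.TryGetValue(docGroups[doc], out bits))
            {
                bits = new BitArray(docGroups.Length);
                groupBits[docGroups[doc]] = bits;
            }
            bits.Set(doc, true);
        }
    }

    // The cheap part, done per query: intersect the query's matching documents
    // with every cached group bit set. queryBits must have one bit per document.
    public Dictionary<string, int> Count(BitArray queryBits)
    {
        var counts = new Dictionary<string, int>();
        foreach (KeyValuePair<string, BitArray> entry in groupBits)
        {
            // Clone first: BitArray.And modifies the instance it is called on.
            BitArray both = ((BitArray)entry.Value.Clone()).And(queryBits);
            int hitCount = 0;
            for (int i = 0; i < both.Length; i++)
            {
                if (both[i]) hitCount++;
            }
            counts[entry.Key] = hitCount;
        }
        return counts;
    }
}
```

In a benchmark, the constructor corresponds to the one-time setup that should be excluded from the timing, while Count plus the document fetch corresponds to the per-search work that should be included.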

Thanks,
DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-20 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: SimpleFacetedSearch.cs

Of course, Ben.

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
 TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[Lucene.Net] [jira] [Created] (LUCENENET-415) Contrib/Faceted Search

2011-05-20 Thread Digy (JIRA)
Contrib/Faceted Search
--

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor


Since I see a lot of questions about faceted search in these days, I plan to 
add a Faceted-Search project to contrib.
DIGY



[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-20 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch.cs
SimpleFacetedSearch.cs

Just a draft.
Needs your contribution.

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search in these days, I plan to 
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-415) Contrib/Faceted Search

2011-05-20 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037061#comment-13037061
 ] 

Digy commented on LUCENENET-415:


Here is the documentation of the code:)

{code}
// Group search results by the "cat" field.
SimpleFacetedSearch sfs = new SimpleFacetedSearch(_Reader, "cat");
Query query = new QueryParser("text", new StandardAnalyzer()).Parse("block*");
SimpleFacetedSearch.Hits hits = sfs.Search(query);

long totalHits = hits.TotalHitCount;
foreach (SimpleFacetedSearch.HitsPerGroup hpg in hits.HitsPerGroup)
{
    // Number of matching documents in this group.
    long hitCountPerGroup = hpg.HitCount;
    foreach (Document doc in hpg)
    {
        string text = doc.GetField("text").StringValue();
    }
}
{code}

DIGY

 Contrib/Faceted Search
 --

 Key: LUCENENET-415
 URL: https://issues.apache.org/jira/browse/LUCENENET-415
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Attachments: SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs


 Since I see a lot of questions about faceted search these days, I plan to
 add a Faceted-Search project to contrib.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-412) Replacing ArrayLists, Hashtables etc. with appropriate Generics.

2011-05-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035795#comment-13035795
 ] 

Digy commented on LUCENENET-412:


Hi All,

Lucene.Net 2.9.4g is almost ready for testing and feedback.

While introducing generics and cleaning up the code, I tried to stay as close
to Lucene 3.0.3 as possible.
Therefore its position is somewhere between Lucene Java 2.9.4 and 3.0.3.

DIGY


PS: For those who might want to try this version:
It probably won't be a drop-in replacement, since there are a few API changes,
like
- StopAnalyzer(List<string> stopWords)
- Query.ExtractTerms(ICollection<string>)
- TopDocs.*TotalHits*, TopDocs.*ScoreDocs*
and some removed methods/classes like
- Filter.Bits
- JustCompileSearch
- Contrib/Similarity.Net




 Replacing ArrayLists, Hashtables etc. with appropriate Generics.
 

 Key: LUCENENET-412
 URL: https://issues.apache.org/jira/browse/LUCENENET-412
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4

 Attachments: IEquatable for QuerySubclasses.patch, 
 LUCENENET-412.patch, lucene_2.9.4g_exceptions_fix


 This will move Lucene.Net.2.9.4 closer to lucene.3.0.3 and allow some 
 performance gains.



[Lucene.Net] [jira] [Commented] (LUCENENET-414) The definition of CharArraySet is dangerously confusing and leads to bugs when used.

2011-05-18 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035817#comment-13035817
 ] 

Digy commented on LUCENENET-414:


Fixed in 2.9.4g
DIGY

 The definition of CharArraySet is dangerously confusing and leads to bugs 
 when used.
 

 Key: LUCENENET-414
 URL: https://issues.apache.org/jira/browse/LUCENENET-414
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Irrelevant
Reporter: Vincent Van Den Berghe
Priority: Minor
 Fix For: Lucene.Net 2.9.2


 Right now, CharArraySet derives from System.Collections.Hashtable, but 
 doesn't actually use this base type for storing elements.
 However, the StandardAnalyzer.STOP_WORDS_SET is exposed as a 
 System.Collections.Hashtable. The trivial code to build your own stopword set 
 using the StandardAnalyzer.STOP_WORDS_SET and adding your own set of 
 stopwords like this:
 CharArraySet myStopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET, 
 ignoreCase: false);
 foreach (string domainSpecificStopWord in DomainSpecificStopWords)
 stopWords.Add(domainSpecificStopWord);
 ... will fail because the CharArraySet accepts an ICollection, which will be 
 passed the Hashtable instance of STOP_WORDS_SET: the resulting myStopWords 
 will only contain the DomainSpecificStopWords, and not those from 
 STOP_WORDS_SET.
 One workaround would be to replace the first line with this:
 CharArraySet stopWords = new 
 CharArraySet(StandardAnalyzer.STOP_WORDS_SET.Count + 
 DomainSpecificStopWords.Length, ignoreCase: false);
 foreach (string domainSpecificStopWord in 
 (CharArraySet)StandardAnalyzer.STOP_WORDS_SET)
 stopWords.Add(domainSpecificStopWord);
 ... but this makes use of the implementation detail (the STOP_WORDS_SET is 
 really an UnmodifiableCharArraySet which is itself a CharArraySet). It works 
 because it forces the foreach() to use the correct 
 CharArraySet.GetEnumerator(), which is defined as a new method (this has a 
 bad code smell to it)
 At least 2 possibilities exist to solve this problem:
 - Make CharArraySet use the Hashtable instance and a custom comparator, 
 instead of its own implementation.
 - Make CharArraySet use HashSet<char[]>, defined in .NET 4.0.
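
The pitfall described above comes from member hiding: a GetEnumerator() declared with `new` is only used when the static type is the derived class, so code that receives the set as an ICollection still enumerates the (empty) Hashtable base. A minimal self-contained sketch, assuming a hypothetical ShadowSet class as a stand-in for the old CharArraySet (it is not the Lucene.Net code):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in: derives from Hashtable but keeps its elements
// in a private list, never touching the base class storage.
class ShadowSet : Hashtable
{
    private readonly List<string> items = new List<string>();

    public ShadowSet() { }

    // Copies from any ICollection. The foreach goes through the interface,
    // which for a Hashtable-derived argument resolves to Hashtable's own
    // enumerator, not to the hiding method below.
    public ShadowSet(ICollection source)
    {
        foreach (object o in source) items.Add(o.ToString());
    }

    public void Add(string s) { items.Add(s); }

    // 'new' hides the base members instead of overriding them.
    public new int Count { get { return items.Count; } }

    public new IEnumerator GetEnumerator() { return items.GetEnumerator(); }
}

static class ShadowDemo
{
    static void Main()
    {
        ShadowSet stop = new ShadowSet();
        stop.Add("the");
        stop.Add("a");

        // The argument is seen as an ICollection, so nothing is copied.
        ShadowSet copy = new ShadowSet(stop);
        Console.WriteLine(copy.Count);   // prints 0

        // Static type is ShadowSet, so the hiding enumerator is used.
        int viaDerived = 0;
        foreach (string s in stop) viaDerived++;
        Console.WriteLine(viaDerived);   // prints 2
    }
}
```

Both proposals in the list above avoid this split by making the storage that callers see and the storage the set actually uses the same object.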



[Lucene.Net] [jira] [Commented] (LUCENENET-412) Replacing ArrayLists, Hashtables etc. with appropriate Generics.

2011-05-17 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035092#comment-13035092
 ] 

Digy commented on LUCENENET-412:


One more sample
{code}
From:
class AnonymousFilterCache : FilterCache
{
    class AnonymousFilteredDocIdSet : FilteredDocIdSet
    {
        IndexReader r;
        public AnonymousFilteredDocIdSet(DocIdSet innerSet, IndexReader r) : base(innerSet)
        {
            this.r = r;
        }
        public override bool Match(int docid)
        {
            return !r.IsDeleted(docid);
        }
    }

    public AnonymousFilterCache(DeletesMode deletesMode) : base(deletesMode)
    {
    }

    protected override object MergeDeletes(IndexReader reader, object docIdSet)
    {
        return new AnonymousFilteredDocIdSet((DocIdSet)docIdSet, reader);
    }
}
...
cache = new AnonymousFilterCache(deletesMode);



To:
cache = new FilterCache<DocIdSet>(deletesMode,
    (reader, docIdSet) =>
    {
        return new FilteredDocIdSet((DocIdSet)docIdSet,
            (docid) =>
            {
                return !reader.IsDeleted(docid);
            });
    });
{code}
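
The From/To pair above replaces a one-off subclass with a delegate passed to the constructor. A minimal self-contained sketch of that pattern, with hypothetical types (DocFilter and LambdaDocFilter are illustrative only and are not Lucene.Net classes):

```csharp
using System;

// Hypothetical base type with a single method to customize.
abstract class DocFilter
{
    public abstract bool Match(int docId);
}

// Instead of writing a named subclass for each variation, accept the
// matching logic as a delegate and forward to it.
sealed class LambdaDocFilter : DocFilter
{
    private readonly Func<int, bool> match;

    public LambdaDocFilter(Func<int, bool> match)
    {
        if (match == null) throw new ArgumentNullException("match");
        this.match = match;
    }

    public override bool Match(int docId) { return match(docId); }
}

static class LambdaDemo
{
    static void Main()
    {
        // The lambda captures 'deleted', much as the original sample
        // captures the reader.
        bool[] deleted = { false, true, false };
        DocFilter live = new LambdaDocFilter(docId => !deleted[docId]);

        Console.WriteLine(live.Match(0));   // prints True
        Console.WriteLine(live.Match(1));   // prints False
    }
}
```

The trade-off is that the behavior is no longer a named type, which keeps call sites short but makes the logic harder to reuse or find by name.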

DIGY

 Replacing ArrayLists, Hashtables etc. with appropriate Generics.
 

 Key: LUCENENET-412
 URL: https://issues.apache.org/jira/browse/LUCENENET-412
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4

 Attachments: IEquatable for QuerySubclasses.patch, 
 LUCENENET-412.patch, lucene_2.9.4g_exceptions_fix


 This will move Lucene.Net.2.9.4 closer to lucene.3.0.3 and allow some 
 performance gains.



[Lucene.Net] [jira] [Resolved] (LUCENENET-405) Port: contrib/Analysis.NGram

2011-05-15 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy resolved LUCENENET-405.


Resolution: Fixed

Committed to trunk & 2.9.4g branch.

 Port: contrib/Analysis.NGram
 

 Key: LUCENENET-405
 URL: https://issues.apache.org/jira/browse/LUCENENET-405
 Project: Lucene.Net
  Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4
Reporter: Digy
Assignee: Digy
Priority: Trivial
 Attachments: NGram.patch


 NGramTokenizer & EdgeNGramTokenizer + Test cases.
 DIGY



[Lucene.Net] [jira] [Commented] (LUCENENET-414) The definition of CharArraySet is dangerously confusing and leads to bugs when used.

2011-05-13 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032982#comment-13032982
 ] 

Digy commented on LUCENENET-414:


Hi Vincent,

I changed the CharArraySet implementation.
Can you take a look at 2.9.4g branch?

( 
https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g
 )

DIGY

 The definition of CharArraySet is dangerously confusing and leads to bugs 
 when used.
 

 Key: LUCENENET-414
 URL: https://issues.apache.org/jira/browse/LUCENENET-414
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
 Environment: Irrelevant
Reporter: Vincent Van Den Berghe
Priority: Minor
 Fix For: Lucene.Net 2.9.2


 Right now, CharArraySet derives from System.Collections.Hashtable, but 
 doesn't actually use this base type for storing elements.
 However, the StandardAnalyzer.STOP_WORDS_SET is exposed as a 
 System.Collections.Hashtable. The trivial code to build your own stopword set 
 using the StandardAnalyzer.STOP_WORDS_SET and adding your own set of 
 stopwords like this:
 CharArraySet myStopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET, 
 ignoreCase: false);
 foreach (string domainSpecificStopWord in DomainSpecificStopWords)
 stopWords.Add(domainSpecificStopWord);
 ... will fail because the CharArraySet accepts an ICollection, which will be 
 passed the Hashtable instance of STOP_WORDS_SET: the resulting myStopWords 
 will only contain the DomainSpecificStopWords, and not those from 
 STOP_WORDS_SET.
 One workaround would be to replace the first line with this:
 CharArraySet stopWords = new 
 CharArraySet(StandardAnalyzer.STOP_WORDS_SET.Count + 
 DomainSpecificStopWords.Length, ignoreCase: false);
 foreach (string domainSpecificStopWord in 
 (CharArraySet)StandardAnalyzer.STOP_WORDS_SET)
 stopWords.Add(domainSpecificStopWord);
 ... but this makes use of the implementation detail (the STOP_WORDS_SET is 
 really an UnmodifiableCharArraySet which is itself a CharArraySet). It works 
 because it forces the foreach() to use the correct 
 CharArraySet.GetEnumerator(), which is defined as a new method (this has a 
 bad code smell to it)
 At least 2 possibilities exist to solve this problem:
 - Make CharArraySet use the Hashtable instance and a custom comparator, 
 instead of its own implementation.
 - Make CharArraySet use HashSet<char[]>, defined in .NET 4.0.



[Lucene.Net] [jira] [Commented] (LUCENENET-266) Putting support classes in separate files and in a separate directory

2011-05-07 Thread Digy (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENENET-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030356#comment-13030356
 ] 

Digy commented on LUCENENET-266:


Hi Prescott,
Thank you for refactoring the SupportClass. Nice work, no failing tests.

DIGY

 Putting  support classes in separate files and in a separate directory
 --

 Key: LUCENENET-266
 URL: https://issues.apache.org/jira/browse/LUCENENET-266
 Project: Lucene.Net
  Issue Type: Improvement
  Components: Lucene.Net Core
Reporter: Andrei Iliev
Assignee: Prescott Nasser
  Labels: refactoring
 Fix For: Lucene.Net 2.9.4


 I am going to add some new classes (nio support, IdentityHashMap, ...). What
 is the best place to put them? SupportClass is getting bigger and bigger.
 I think it is time to put all support classes in separate files and in a
 separate directory (e.g. JavaSupport). Any comments?



[Lucene.Net] [jira] [Updated] (LUCENENET-412) Replacing ArrayLists, Hashtables etc. with appropriate Generics.

2011-05-06 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-412:
---

Attachment: IEquatable for QuerySubclasses.patch

I am not sure about committing this IEquatable patch. To gain a slight
performance improvement, all of the Equals code is duplicated.

Here is the list of affected files:
ConstantScoreQuery.cs
DisjunctionMaxQuery.cs
FilteredQuery.cs
Function/CustomScoreQuery.cs
Function/ValueSourceQuery.cs
MatchAllDocsQuery.cs
MultiPhraseQuery.cs
MultiTermQuery.cs
Payloads/PayloadNearQuery.cs
Payloads/PayloadTermQuery.cs
PhraseQuery.cs
RangeQuery.cs
Spans/SpanFirstQuery.cs
Spans/SpanNearQuery.cs
Spans/SpanNotQuery.cs
Spans/SpanOrQuery.cs
Spans/SpanTermQuery.cs
TermQuery.cs


DIGY

 Replacing ArrayLists, Hashtables etc. with appropriate Generics.
 

 Key: LUCENENET-412
 URL: https://issues.apache.org/jira/browse/LUCENENET-412
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.4
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4

 Attachments: IEquatable for QuerySubclasses.patch, 
 LUCENENET-412.patch, lucene_2.9.4g_exceptions_fix


 This will move Lucene.Net.2.9.4 closer to lucene.3.0.3 and allow some 
 performance gains.



[Lucene.Net] [jira] [Updated] (LUCENENET-413) Medium trust security issue

2011-05-04 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-413:
---

Attachment: MediumTrust.2.9.4.patch

Added the constants.cs fix to the patch.

  Medium trust security issue 
 -

 Key: LUCENENET-413
 URL: https://issues.apache.org/jira/browse/LUCENENET-413
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.4
 Environment: Lucene.Net 2.9.4, Lucene.Net 2.9.4g , .Net 4.0
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4

 Attachments: MediumTrust.2.9.4.patch, MediumTrust.2.9.4.patch, 
 MediumTrust.2.9.4g.patch


 On behalf of Richard Wilde:
 Exceptions in Medium Trust(.NET 4.0)



[Lucene.Net] [jira] [Created] (LUCENENET-413) Medium trust security issue

2011-05-03 Thread Digy (JIRA)
 Medium trust security issue 
-

 Key: LUCENENET-413
 URL: https://issues.apache.org/jira/browse/LUCENENET-413
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.4
 Environment: Lucene.Net 2.9.4, Lucene.Net 2.9.4g , .Net 4.0
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4


On behalf of Richard Wilde:
Exceptions in Medium Trust(.NET 4.0)


