date:20130913

[jira] [Updated] (LUCENE-5212) java 7u40 causes sigsegv and corrupt term vectors

2013-09-13 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5212:


Attachment: hs_err_pid32714.log

Attached is the hs_err

> java 7u40 causes sigsegv and corrupt term vectors
> -
>
> Key: LUCENE-5212
> URL: https://issues.apache.org/jira/browse/LUCENE-5212
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: hs_err_pid32714.log
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5212) java 7u40 causes sigsegv and corrupt term vectors

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767385#comment-13767385
 ] 

Robert Muir commented on LUCENE-5212:
-

This has happened twice in Jenkins since Uwe upgraded, so I tried to reproduce 
it myself.

With update 25, no issues.
So I upgraded to update 40, and it crashed on the first try with the Jenkins command line:

rmuir@beast:~/workspace/lucene-trunk/lucene/core$ ant test -Dtests.seed=43A1116E7F98BED4 -Dtests.jvms=1 -Dtests.dynamicAssignmentRatio=0 -Dargs="-XX:-UseCompressedOops -XX:+UseParallelGC"

{noformat}
   [junit4] #
   [junit4] # A fatal error has been detected by the Java Runtime Environment:
   [junit4] #
   [junit4] #  SIGSEGV (0xb) at pc=0x7f163d2d34dd, pid=32714, tid=139732803393280
   [junit4] #
   [junit4] # JRE version: Java(TM) SE Runtime Environment (7.0_40-b43) (build 1.7.0_40-b43)
   [junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.0-b56 mixed mode linux-amd64 )
   [junit4] # Problematic frame:
   [junit4] # J  org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.get(I)Lorg/apache/lucene/index/Fields;
   [junit4] #
   [junit4] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
   [junit4] #
   [junit4] # An error report file with more information is saved as:
   [junit4] # /home/rmuir/workspace/lucene-trunk/lucene/build/core/test/J0/hs_err_pid32714.log
{noformat}



[jira] [Created] (LUCENE-5212) java 7u40 causes sigsegv and corrupt term vectors

2013-09-13 Thread Robert Muir (JIRA)

Robert Muir created LUCENE-5212:
---

 Summary: java 7u40 causes sigsegv and corrupt term vectors
 Key: LUCENE-5212
 URL: https://issues.apache.org/jira/browse/LUCENE-5212
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir





[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767368#comment-13767368
 ] 

Robert Muir commented on SOLR-5241:
---

If the test violated the security settings, a SecurityException would be 
thrown.

So maybe there is a bug in the Solr code that hides or masks exceptions.
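
A minimal sketch of how such masking could happen, assuming a helper that catches every exception. The class and method names are hypothetical and are not taken from the Solr code; the SecurityException is simulated rather than raised by a real policy:

```java
import java.util.concurrent.Callable;

/** Hypothetical illustration; none of these names come from Solr. */
final class MaskedSecurityExceptionSketch {

    /** Stand-in for an operation that the test policy would reject. */
    static String forbiddenLookup() {
        throw new SecurityException("access denied (simulated policy violation)");
    }

    /** Anti-pattern: a catch-all turns the violation into a quiet fallback. */
    static String swallowing(Callable<String> op) {
        try {
            return op.call();
        } catch (Exception e) {
            return "fallback"; // the SecurityException is lost here
        }
    }

    /** Safer variant: rethrow SecurityException, handle only other failures. */
    static String propagating(Callable<String> op) {
        try {
            return op.call();
        } catch (SecurityException e) {
            throw e;
        } catch (Exception e) {
            return "fallback";
        }
    }

    public static void main(String[] args) {
        System.out.println(swallowing(MaskedSecurityExceptionSketch::forbiddenLookup));
        try {
            propagating(MaskedSecurityExceptionSketch::forbiddenLookup);
        } catch (SecurityException e) {
            System.out.println("surfaced: " + e.getMessage());
        }
    }
}
```

With the first variant a test could only look slow or oddly behaved; with the second the policy violation would show up as a failure.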

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch, SOLR-5241.patch
>
>
> As noted by Shai on the dev@lucene list, SimplePostToolTest is ridiculously 
> slow when run from Ant, but takes only 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com".


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767367#comment-13767367
 ] 

Shai Erera commented on SOLR-5241:
--

Hoss, see my comments on the thread. Seems to be related to our tests.policy 
security settings. I don't mind if you commit the patch, but maybe if there's a 
simple solution, we don't need to change the test.


Re: SimplePostToolTest very slow

2013-09-13 Thread Shai Erera

OK, I think I've made some progress -- configuration issue, but not
proxies, it seems to be our security manager/policy.

I printed system properties and env in setUp and ran only a single test
(testIsOn) to compare the output. I didn't notice any proxy settings, but I
did notice that from Ant we're using our own security manager and policy.
So when I ran the test from eclipse using
-Djava.security.manager=org.apache.lucene.util.TestSecurityManager
-Djava.security.policy=D:\dev\lucene\lucene-trunk\lucene\tools\junit4\tests.policy,
it ran for 23s too (I only ran a single test method).
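
The comparison described above can be sketched as a small helper that prints selected system properties in sorted order, so that the output of an Ant run and an Eclipse run can be diffed. The class name and the choice of prefixes are assumptions made for illustration, not what was actually used in setUp:

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper for comparing two launch environments. */
final class EnvDumpSketch {

    /** Returns the properties whose names start with any prefix, sorted by name. */
    static Map<String, String> select(Properties props, String... prefixes) {
        Map<String, String> out = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            for (String prefix : prefixes) {
                if (name.startsWith(prefix)) {
                    out.put(name, props.getProperty(name));
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prefixes chosen as a guess at what matters for URL handling.
        Map<String, String> selected = select(System.getProperties(),
                "java.security.", "http.", "https.", "socksProxy", "java.net.");
        for (Map.Entry<String, String> e : selected.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Sorting the names keeps the two dumps line-aligned, which makes a missing or extra java.security.* entry easy to spot.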

I don't think it's related to TestSecurityManager, since it only checks
that System.exit isn't called from tests.
Looking at tests.policy, there are a bunch of socket connection related
lines .. I'm guessing it's somewhere there, though I don't know this stuff.

I will try to look into it more later.

Shai


On Sat, Sep 14, 2013 at 6:59 AM, Shai Erera  wrote:

> I still don't see why you'd get different timings between Eclipse and
>> Ant if you're running with the same VM -- it should be pretty
>> consistent (either it caches dns lookups or it doesn't).
>>
>
> I agree, it's suspicious. I searched for URL performance differences
> between eclipse and Ant and hit this page:
> http://ant.apache.org/manual/proxy.html. It suggests Ant uses different
> proxy settings by default for its own tasks as well as 3rd party tasks that
> use java.net.URL. I tried running with -autoproxy but from Ant each test
> method still takes ~23s vs Eclipse which is ~0.1s.
>
> Will be interesting to identify the differences .. I think it's a
> configuration issue, as Eclipse needs to make URL connections for e.g. its
> marketplace, so maybe it comes pre-configured somehow. I'm now curious,
> I'll dig :).
>
> Shai
>
>
> On Sat, Sep 14, 2013 at 12:33 AM, Chris Hostetter <
> hossman_luc...@fucit.org> wrote:
>
>>
>> : If you want to experiment, a really trivial test is below -- on my system,
>> : there is a ~5 second gap between each pair of "INIT" and "H1" timestamps,
>> : but no other odd gaps in subsequent timestamps -- suggesting no caching of
>> : DNS per hostname, but that the URL class doesn't "re-lookup" on subsequent
>> : hashCode() calls.
>>
>> Or maybe i could actually cut & paste the test this time...
>>
>>
>> import java.net.URL;
>> import org.apache.lucene.util.LuceneTestCase;
>>
>> public class TestURLDNSAbsurdity extends LuceneTestCase {
>>
>>   public void testHowSlowCanYouGo() throws Exception {
>>     go("1");
>>     go("2");
>>   }
>>
>>   // Prints a timestamp after each step so slow calls show up as gaps.
>>   public void go(String s) throws Exception {
>>     System.out.println(s + " PRE: " + System.currentTimeMillis());
>>     URL url = new URL("http://example.com/");
>>     System.out.println(s + "INIT: " + System.currentTimeMillis());
>>     int h1 = url.hashCode(); // first hashCode() call on this URL
>>     System.out.println(s + "  H1: " + System.currentTimeMillis());
>>     int h2 = url.hashCode();
>>     System.out.println(s + "  H2: " + System.currentTimeMillis());
>>     boolean e1 = url.equals(this);
>>     System.out.println(s + "  E1: " + System.currentTimeMillis());
>>     boolean e2 = url.equals(this);
>>     System.out.println(s + "  E2: " + System.currentTimeMillis());
>>     assertEquals(h1,h2);
>>     assertEquals(e1,e2);
>>   }
>> }
>>
>> ...
>>
>>[junit4] Started J0 PID(31843@frisbee).
>>[junit4] Suite: org.apache.solr.util.TestURLDNSAbsurdity
>>[junit4]   1> 1 PRE: 1379107971313
>>[junit4]   1> 1INIT: 1379107971314
>>[junit4]   1> 1  H1: 1379107976449
>>[junit4]   1> 1  H2: 1379107976449
>>[junit4]   1> 1  E1: 1379107976449
>>[junit4]   1> 1  E2: 1379107976449
>>[junit4]   1> 2 PRE: 1379107976450
>>[junit4]   1> 2INIT: 1379107976450
>>[junit4]   1> 2  H1: 1379107981510
>>[junit4]   1> 2  H2: 1379107981510
>>[junit4]   1> 2  E1: 1379107981510
>>[junit4]   1> 2  E2: 1379107981510
>>[junit4] OK  10.3s | TestURLDNSAbsurdity.testHowSlowCanYouGo
>>
>>
>>
>>
>> -Hoss
>>
>>
>>
>

[jira] [Comment Edited] (SOLR-4787) Join Contrib

2013-09-13 Thread Kranti Parisa (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767361#comment-13767361
 ] 

Kranti Parisa edited comment on SOLR-4787 at 9/14/13 4:16 AM:
--

Is something missing in the patch? I am seeing a ByteArray compilation problem. 
Also, does bjoin need any specific types of field configs in schema.xml?

  was (Author: krantiparisa):
Something missing in the Patch? I am seeing ByteArray compilation problem. 
Also does bjoin needs any specific types of field configs in schema.xml ?
  

[jira] [Commented] (SOLR-4787) Join Contrib

2013-09-13 Thread Kranti Parisa (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767361#comment-13767361
 ] 

Kranti Parisa commented on SOLR-4787:
-

Is something missing in the patch? I am seeing a ByteArray compilation problem. Also, 
does bjoin need any specific types of field configs in schema.xml?

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *HashSetJoinQParserPlugin aka hjoin*
> The hjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the hjoin is designed to work with int and long join 
> keys only. So, in order to use hjoin, int or long join keys must be included 
> in both the to and from core.
> The second difference is that the hjoin builds memory structures that are 
> used to quickly connect the join keys. So, the hjoin will need more memory 
> than the JoinQParserPlugin to perform the join.
> The main advantage of the hjoin is that it can scale to join millions of keys 
> between cores and provide sub-second response time. The hjoin should work 
> well with up to two million results from the fromIndex and tens of millions 
> of results from the main query.
> The hjoin supports the following features:
> 1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will 
> turn on the PostFilter. The PostFilter will typically outperform the Lucene 
> query when the main query results have been narrowed down.
> 2) With the lucene query implementation there is an option to build the 
> filter with threads. This can greatly improve the performance of the query if 
> the main query index is very large. The "threads" parameter turns on 
> threading. For example *threads=6* will use 6 threads to build the filter. 
> This will set up a fixed threadpool with six threads to handle all hjoin 
> requests. Once the threadpool is created the hjoin will always use it to 
> build the filter. Threading does not come into play with the PostFilter.
> 3) The *size* local parameter can be used to set the initial size of the 
> hashset used to perform the join. If this is set above the number of results 
> from the fromIndex, then you can avoid hashset resizing, which improves 
> performance.
> 4) Nested filter queries. The local parameter "fq" can be used to nest a 
> filter query within the join. The nested fq will filter the results of the 
> join query. This can point to another join to support nested joins.
> 5) Full caching support for the lucene query implementation. The filterCache 
> and queryResultCache should work properly even with deep nesting of joins. 
> Only the queryResultCache comes into play with the PostFilter implementation 
> because PostFilters are not cacheable in the filterCache.
> The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
> plugin is referenced by the string "hjoin" rather than "join".
> fq={!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 fq=$qq}user:customer1&qq=group:5
> The example filter query above will search the fromIndex (collection2) for 
> "user:customer1" applying the local fq parameter to filter the results. The 
> lucene filter query will be built using 6 threads. This query will generate a 
> list of values from the "from" field that will be used to filter the main 
> query. Only records from the main query, where the "to" field is present in 
> the "from" list will be included in the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> hjoin.
>  class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
>  
>  
> *BitSetJoinQParserPlugin aka bjoin*
> The bjoin behaves exactly like the hjoin but uses a BitSet instead of a 
> HashSet to perform the underlying join. Because of this
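
The hjoin idea quoted above (collect the "from" keys into a pre-sized hash set, then keep only main-query documents whose "to" key is in that set) can be sketched roughly as follows. This is not the contrib's implementation: the class, the array-based inputs and the sizing formula are assumptions for illustration, and the real plugin reads keys from the index rather than from arrays:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of a hash-set join on long keys. */
final class HashSetJoinSketch {

    /**
     * Collects the "from" keys. Sizing the set up front (the role the "size"
     * local parameter plays in the description) avoids rehashing while filling.
     */
    static Set<Long> collectFromKeys(long[] fromKeys, int expectedSize) {
        // Capacity chosen so expectedSize entries stay under the 0.75 load factor.
        int capacity = (int) (expectedSize / 0.75f) + 1;
        Set<Long> keys = new HashSet<>(capacity);
        for (long key : fromKeys) {
            keys.add(key);
        }
        return keys;
    }

    /** Returns the ids of main-query docs whose "to" key appears in the set. */
    static List<Integer> filterMainDocs(long[] toKeyByDoc, Set<Long> fromKeys) {
        List<Integer> matches = new ArrayList<>();
        for (int doc = 0; doc < toKeyByDoc.length; doc++) {
            if (fromKeys.contains(toKeyByDoc[doc])) {
                matches.add(doc);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Set<Long> from = collectFromKeys(new long[] {5L, 7L, 9L}, 3);
        System.out.println(filterMainDocs(new long[] {1L, 5L, 9L, 5L, 2L}, from));
    }
}
```

The sketch also shows where the extra memory goes: the whole "from" key set is held in memory for the duration of the join.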

Re: SimplePostToolTest very slow

2013-09-13 Thread Shai Erera

>
> I still don't see why you'd get different timings between Eclipse and
> Ant if you're running with the same VM -- it should be pretty
> consistent (either it caches dns lookups or it doesn't).
>

I agree, it's suspicious. I searched for URL performance differences
between eclipse and Ant and hit this page:
http://ant.apache.org/manual/proxy.html. It suggests Ant uses different
proxy settings by default for its own tasks as well as 3rd party tasks that
use java.net.URL. I tried running with -autoproxy but from Ant each test
method still takes ~23s vs Eclipse which is ~0.1s.

Will be interesting to identify the differences .. I think it's a
configuration issue, as Eclipse needs to make URL connections for e.g. its
marketplace, so maybe it comes pre-configured somehow. I'm now curious,
I'll dig :).

Shai


On Sat, Sep 14, 2013 at 12:33 AM, Chris Hostetter
wrote:

>
> : If you want to experiment, a really trivial test is below -- on my system,
> : there is a ~5 second gap between each pair of "INIT" and "H1" timestamps,
> : but no other odd gaps in subsequent timestamps -- suggesting no caching of
> : DNS per hostname, but that the URL class doesn't "re-lookup" on subsequent
> : hashCode() calls.
>
> Or maybe i could actually cut & paste the test this time...
>
>
> import java.net.URL;
> import org.apache.lucene.util.LuceneTestCase;
>
> public class TestURLDNSAbsurdity extends LuceneTestCase {
>
>   public void testHowSlowCanYouGo() throws Exception {
>     go("1");
>     go("2");
>   }
>
>   // Prints a timestamp after each step so slow calls show up as gaps.
>   public void go(String s) throws Exception {
>     System.out.println(s + " PRE: " + System.currentTimeMillis());
>     URL url = new URL("http://example.com/");
>     System.out.println(s + "INIT: " + System.currentTimeMillis());
>     int h1 = url.hashCode(); // first hashCode() call on this URL
>     System.out.println(s + "  H1: " + System.currentTimeMillis());
>     int h2 = url.hashCode();
>     System.out.println(s + "  H2: " + System.currentTimeMillis());
>     boolean e1 = url.equals(this);
>     System.out.println(s + "  E1: " + System.currentTimeMillis());
>     boolean e2 = url.equals(this);
>     System.out.println(s + "  E2: " + System.currentTimeMillis());
>     assertEquals(h1,h2);
>     assertEquals(e1,e2);
>   }
> }
>
> ...
>
>[junit4] Started J0 PID(31843@frisbee).
>[junit4] Suite: org.apache.solr.util.TestURLDNSAbsurdity
>[junit4]   1> 1 PRE: 1379107971313
>[junit4]   1> 1INIT: 1379107971314
>[junit4]   1> 1  H1: 1379107976449
>[junit4]   1> 1  H2: 1379107976449
>[junit4]   1> 1  E1: 1379107976449
>[junit4]   1> 1  E2: 1379107976449
>[junit4]   1> 2 PRE: 1379107976450
>[junit4]   1> 2INIT: 1379107976450
>[junit4]   1> 2  H1: 1379107981510
>[junit4]   1> 2  H2: 1379107981510
>[junit4]   1> 2  E1: 1379107981510
>[junit4]   1> 2  E2: 1379107981510
>[junit4] OK  10.3s | TestURLDNSAbsurdity.testHowSlowCanYouGo
>
>
>
>
> -Hoss
>
>
>
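
For code that only needs to compare or hash addresses, one way to sidestep the gap seen at "H1" is java.net.URI, whose equals() and hashCode() work on the parsed components and do not resolve the host, whereas java.net.URL documents that its host comparison may require name resolution. A minimal sketch, with a hypothetical class name, and with no claim that this is what the SOLR-5241 patch does:

```java
import java.net.URI;

/** Hypothetical illustration: address identity without any DNS lookup. */
final class UriIdentitySketch {

    /** Compares two addresses using URI.equals, which does not touch the network. */
    static boolean sameResource(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        URI uri = URI.create("http://example.com/");
        int hash = uri.hashCode(); // computed from the URI's components only
        boolean same = sameResource("http://example.com/", "http://example.com/");
        long micros = (System.nanoTime() - start) / 1000;
        System.out.println("hash=" + hash + " same=" + same + " elapsed(us)=" + micros);
    }
}
```

A URI can still be converted with toURL() at the point where a connection is actually opened, so the lookup cost is paid only where it is needed.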

[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767351#comment-13767351
 ] 

Shai Erera commented on SOLR-5241:
--

This still runs fast, but a bit slower than with the previous patch: 5s from 
Ant. However, with this patch (I haven't checked previous patch), Eclipse runs 
faster than before: 0.1s (vs 1s). I'll try to get to the bottom of it, though 
let's commit this (or previous) patch, because it's already a huge improvement.


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767328#comment-13767328
 ] 

Robert Muir commented on SOLR-5241:
---

{quote}
I've got no problem doing that if you think it makes a diff – but just so I 
understand: can you explain why that is better than 127.42.42.42?
{quote}

It's specifically allocated for test purposes and won't be routed if something 
tries to make a connection, so it will fail fast with protocol not supported, or 
worst case no route to host... everywhere, not just Jenkins.

In Jenkins specifically, tests can *never* try to connect to an unbound port 
and expect a connection refused; it will just hang for a huge amount of time 
until it finally times out.
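
Independently of which address a test uses, a connection attempt can be bounded with an explicit connect timeout, so a silently dropped packet costs at most that long instead of the operating system default. A minimal sketch, assuming a hypothetical helper and loopback-only usage; whether this fits the test under discussion is an assumption:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper: bounded connection attempt. */
final class FailFastConnectSketch {

    /** Returns true if a TCP connection is established within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }

    public static void main(String[] args) throws IOException {
        int port;
        try (ServerSocket server = new ServerSocket(0)) { // any free port
            port = server.getLocalPort();
            System.out.println("listening: " + canConnect("127.0.0.1", port, 500));
        }
        // The listener is closed now, so this attempt is bounded by the timeout.
        System.out.println("closed:    " + canConnect("127.0.0.1", port, 500));
    }
}
```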




[jira] [Commented] (LUCENE-4906) PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767301#comment-13767301
 ] 

Robert Muir commented on LUCENE-4906:
-

Can we fix the visibility of this method to be protected?


> PostingsHighlighter's PassageFormatter should allow for rendering to 
> arbitrary objects
> --
>
> Key: LUCENE-4906
> URL: https://issues.apache.org/jira/browse/LUCENE-4906
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-4906.patch, LUCENE-4906.patch, LUCENE-4906.patch
>
>
> For example, in a server, I may want to render the highlight result to 
> JsonObject to send back to the front-end. Today since we render to string, I 
> have to render to JSON string and then re-parse to JsonObject, which is 
> inefficient...
> Or, if (Rob's idea:) we make a query that's like MoreLikeThis but it pulls 
> terms from snippets instead, so you get proximity-influenced salient/expanded 
> terms, then perhaps that renders to just an array of tokens or fragments or 
> something from each snippet.
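
The request in the description can be pictured with a formatter that is generic in its result type, so one caller gets a string and another gets structured fragments without re-parsing. This is only a sketch of the idea: Snippet and Formatter below are hypothetical stand-ins, not Lucene's Passage or PassageFormatter, and the patch may well solve it differently:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of rendering highlights to arbitrary objects. */
final class GenericFormatterSketch {

    /** Stand-in for a highlighted passage: character offsets into the content. */
    static final class Snippet {
        final int start;
        final int end;

        Snippet(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Formatter whose result type is chosen by the implementation. */
    interface Formatter<T> {
        T format(List<Snippet> snippets, String content);
    }

    /** Renders all snippets into one string, separated by an ellipsis. */
    static final Formatter<String> AS_STRING = (snippets, content) -> {
        StringBuilder sb = new StringBuilder();
        for (Snippet s : snippets) {
            if (sb.length() > 0) {
                sb.append("... ");
            }
            sb.append(content, s.start, s.end);
        }
        return sb.toString();
    };

    /** Renders each snippet as its own fragment, e.g. for a JSON array. */
    static final Formatter<List<String>> AS_FRAGMENTS = (snippets, content) -> {
        List<String> out = new ArrayList<>();
        for (Snippet s : snippets) {
            out.add(content.substring(s.start, s.end));
        }
        return out;
    };

    public static void main(String[] args) {
        String content = "the quick brown fox";
        List<Snippet> snippets = Arrays.asList(new Snippet(4, 9), new Snippet(16, 19));
        System.out.println(AS_STRING.format(snippets, content));
        System.out.println(AS_FRAGMENTS.format(snippets, content));
    }
}
```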


[jira] [Updated] (LUCENE-5210) Unit tests for LicenseCheckTask.

2013-09-13 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-5210:


Attachment: LUCENE-5210.patch

Here is a more complete patch.

> Unit tests for LicenseCheckTask.
> 
>
> Key: LUCENE-5210
> URL: https://issues.apache.org/jira/browse/LUCENE-5210
> Project: Lucene - Core
>  Issue Type: Test
>  Components: general/build
>Reporter: Mark Miller
> Attachments: LUCENE-5210.patch, LUCENE-5210.patch
>
>
> While working on LUCENE-5209, I noticed the LicenseCheckTask is kind of a 
> second class citizen - excluded from UI src folder setup and with no unit 
> tests. This was a little scary to me.
> I've started adding some unit tests. So far I have mainly just done the 
> lifting of getting unit tests to work as part of tools.
> I have added two super simple tests - really just the start - but something 
> to build on.


Re: Joins on the confluence wiki

2013-09-13 Thread Kranti Parisa

Cool, thanks.

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Fri, Sep 13, 2013 at 8:17 PM, Erick Erickson wrote:

> I added you to the Solr contributors group, you should be able to edit the
> Wiki now.
>
> The Confluence thing is a bit more restricted, but we can always use the
> Wiki page as a template...
>
> Thanks for helping!
>
> Erick
>
>
>
> On Fri, Sep 13, 2013 at 6:12 PM, Kranti Parisa wrote:
>
>> Yes, there is a wiki page, but I can't edit that either.
>>
>> If you can create a page on Solr Ref Guide for Joins under Searching, I
>> can work with you for the same if I am not allowed to edit directly!
>>
>> Thanks & Regards,
>> Kranti K Parisa
>> http://www.linkedin.com/in/krantiparisa
>>
>>
>>
>> On Fri, Sep 13, 2013 at 1:22 PM, Cassandra Targett wrote:
>>
>>> Kranti, are you referring to the Solr Ref Guide (which is a Confluence
>>> wiki)? I notice that there is a page already in the Solr wiki about
>>> joins: http://wiki.apache.org/solr/Join, but not one in the Ref Guide
>>> yet.
>>>
>>> Policies for editing the Solr Ref Guide are different from the Solr
>>> wiki, and are here:
>>>
>>> https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation
>>> .
>>>
>>> If you can't get access to create content for the Solr Ref Guide, you
>>> can still make comments with suggestions for improvements. If you do
>>> that, I'll be happy to add the new page and work with you to make it
>>> right.
>>>
>>> Cassandra
>>>
>>> On Fri, Sep 13, 2013 at 8:06 AM, Erick Erickson 
>>> wrote:
>>> > Just let us know your Wiki user ID and we'll add you
>>> > to the approved list right away.
>>> >
>>> > Had some trouble with spam bots a while back so had to go
>>> > this route.
>>> >
>>> > Thanks for volunteering to help!
>>> >
>>> > Erick
>>> >
>>> >
>>> > On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa <kranti.par...@gmail.com> wrote:
>>> >>
>>> >> Guys,
>>> >>
>>> >> It seems there is no wiki page for Joins. I have been using and working on
>>> >> Joins, and I want to start writing a page on them for the Confluence wiki.
>>> >> How can I get access for adding/editing the wiki pages?
>>> >>
>>> >> Thanks & Regards,
>>> >> Kranti K Parisa
>>> >> http://www.linkedin.com/in/krantiparisa
>>> >>
>>> >
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>

[jira] [Commented] (LUCENE-5211) StopFilterFactory docs do not advertise/explain hte "format" option

2013-09-13 Thread Hayden Muhl (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767238#comment-13767238
 ] 

Hayden Muhl commented on LUCENE-5211:
-

Ah, very good. I was a bit shocked when my French stop words weren't working. 
This seemed like too big of a functionality bug to be easily missed.

> StopFilterFactory docs do not advertise/explain hte "format" option
> ---
>
> Key: LUCENE-5211
> URL: https://issues.apache.org/jira/browse/LUCENE-5211
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.2
>Reporter: Hayden Muhl
>Assignee: Hoss Man
>Priority: Minor
>
> StopFilterFactory supports a "format" option for controlling whether 
> "getWordSet" or "getSnowballWordSet" is used to parse the file, but this 
> option is not advertised. People can be confused by the example stopword 
> files included in the releases (some of which are in the snowball format 
> w/ "|" comments): they try to use them w/o explicitly specifying 
> {{format="snowball"}} and silently get useless stopwords (that include the 
> "| comments" as literal portions of the stopwords).
> We need to better document the use of "format" and consider updating all of 
> the example stopword files we ship that are in the snowball format with a 
> note about the need to use {{format="snowball"}} with those files.
> {panel:title=Initial Bug Report}
> The StopFilterFactory builds a CharArraySet directly from the raw lines of 
> the supplied words file. This causes a problem when using the stop word files 
> supplied with the Solr/Lucene distribution. In particular, the comments in 
> those files get added to the CharArraySet. A line like this...
> ceci   |  this
> Should result in the string "ceci" being added to the CharArraySet, but "ceci 
>   |  this" is what actually gets added.
> Workaround: Remove all comments from stop word files you are using.
> Suggested fix: The StopFilterFactory should strip any comments, then strip 
> trailing whitespace. The stop word files supplied with the distribution 
> should be edited to conform to the supported comment format.
> {panel}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Joins on the confluence wiki

2013-09-13 Thread Erick Erickson

I added you to the Solr contributors group, you should be able to edit the
Wiki now.

The Confluence thing is a bit more restricted, but we can always use the
Wiki page as a template...

Thanks for helping!

Erick



On Fri, Sep 13, 2013 at 6:12 PM, Kranti Parisa wrote:

> Yes, there is a wiki page but I can't edit that either.
>
> If you can create a page on Solr Ref Guide for Joins under Searching, I
> can work with you for the same if I am not allowed to edit directly!
>
> Thanks & Regards,
> Kranti K Parisa
> http://www.linkedin.com/in/krantiparisa
>
>
>
> On Fri, Sep 13, 2013 at 1:22 PM, Cassandra Targett 
> wrote:
>
>> Kranti, are you referring to the Solr Ref Guide (which is a Confluence
>> wiki)? I notice that there is a page already in the Solr wiki about
>> joins: http://wiki.apache.org/solr/Join, but not one in the Ref Guide
>> yet.
>>
>> Policies for editing the Solr Ref Guide are different from the Solr
>> wiki, and are here:
>>
>> https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation
>> .
>>
>> If you can't get access to create content for the Solr Ref Guide, you
>> can still make comments with suggestions for improvements. If you do
>> that, I'll be happy to add the new page and work with you to make it
>> right.
>>
>> Cassandra
>>
>> On Fri, Sep 13, 2013 at 8:06 AM, Erick Erickson 
>> wrote:
>> > Just let us know your Wiki user ID and we'll add you
>> > to the approved list right away.
>> >
>> > Had some trouble with spam bots a while back so had to go
>> > this route.
>> >
>> > Thanks for volunteering to help!
>> >
>> > Erick
>> >
>> >
>> > On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa wrote:
>> >>
>> >> Guys,
>> >>
>> >> Seems there is no wiki page for Joins. I have been using/working with Joins
>> >> and I want to start writing a page for the same on the Confluence wiki. How
>> >> can I get access for adding/editing the wiki pages?
>> >>
>> >> Thanks & Regards,
>> >> Kranti K Parisa
>> >> http://www.linkedin.com/in/krantiparisa
>> >>
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>

[jira] [Updated] (LUCENE-5211) StopFilterFactory docs do not advertise/explain hte "format" option

2013-09-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-5211:
-

Component/s: (was: core/search)
Description: 
StopFilterFactory supports a "format" option for controlling whether 
"getWordSet" or "getSnowballWordSet" is used to parse the file, but this option 
is not advertised. People can be confused by the example stopword files 
included in the releases (some of which are in the snowball format w/ "|" 
comments): they try to use them w/o explicitly specifying {{format="snowball"}} 
and silently get useless stopwords (that include the "| comments" as literal 
portions of the stopwords).

We need to better document the use of "format" and consider updating all of the 
example stopword files we ship that are in the snowball format with a note 
about the need to use {{format="snowball"}} with those files.

{panel:title=Initial Bug Report}

The StopFilterFactory builds a CharArraySet directly from the raw lines of the 
supplied words file. This causes a problem when using the stop word files 
supplied with the Solr/Lucene distribution. In particular, the comments in 
those files get added to the CharArraySet. A line like this...

ceci   |  this

Should result in the string "ceci" being added to the CharArraySet, but "ceci   
|  this" is what actually gets added.

Workaround: Remove all comments from stop word files you are using.

Suggested fix: The StopFilterFactory should strip any comments, then strip 
trailing whitespace. The stop word files supplied with the distribution should 
be edited to conform to the supported comment format.
{panel}

  was:
The StopFilterFactory builds a CharArraySet directly from the raw lines of the 
supplied words file. This causes a problem when using the stop word files 
supplied with the Solr/Lucene distribution. In particular, the comments in 
those files get added to the CharArraySet. A line like this...

ceci   |  this

Should result in the string "ceci" being added to the CharArraySet, but "ceci   
|  this" is what actually gets added.

Workaround: Remove all comments from stop word files you are using.

Suggested fix: The StopFilterFactory should strip any comments, then strip 
trailing whitespace. The stop word files supplied with the distribution should 
be edited to conform to the supported comment format.

   Priority: Minor  (was: Major)
   Assignee: Hoss Man
Summary: StopFilterFactory docs do not advertise/explain hte "format" 
option  (was: StopFilterFactory does not honor comments)

> StopFilterFactory docs do not advertise/explain hte "format" option
> ---
>
> Key: LUCENE-5211
> URL: https://issues.apache.org/jira/browse/LUCENE-5211
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.2
>Reporter: Hayden Muhl
>Assignee: Hoss Man
>Priority: Minor
>
> StopFilterFactory supports a "format" option for controlling whether 
> "getWordSet" or "getSnowballWordSet" is used to parse the file, but this 
> option is not advertised. People can be confused by the example stopword 
> files included in the releases (some of which are in the snowball format 
> w/ "|" comments): they try to use them w/o explicitly specifying 
> {{format="snowball"}} and silently get useless stopwords (that include the 
> "| comments" as literal portions of the stopwords).
> We need to better document the use of "format" and consider updating all of 
> the example stopword files we ship that are in the snowball format with a 
> note about the need to use {{format="snowball"}} with those files.
> {panel:title=Initial Bug Report}
> The StopFilterFactory builds a CharArraySet directly from the raw lines of 
> the supplied words file. This causes a problem when using the stop word files 
> supplied with the Solr/Lucene distribution. In particular, the comments in 
> those files get added to the CharArraySet. A line like this...
> ceci   |  this
> Should result in the string "ceci" being added to the CharArraySet, but "ceci 
>   |  this" is what actually gets added.
> Workaround: Remove all comments from stop word files you are using.
> Suggested fix: The StopFilterFactory should strip any comments, then strip 
> trailing whitespace. The stop word files supplied with the distribution 
> should be edited to conform to the supported comment format.
> {panel}


[jira] [Commented] (LUCENE-5211) StopFilterFactory does not honor comments

2013-09-13 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767145#comment-13767145
 ] 

Hoss Man commented on LUCENE-5211:
--

The StopFilterFactory supports two different "formats" of stop word files. The 
default format, which has been supported since day #1, allows comments using "#", 
but more recently support was added for the "snowball" stopword format, which is 
what is used in the stopwords_fr.txt file you seem to be referring to.

The example usage of stopwords_fr.txt in Solr explicitly configures the 
StopFilterFactory so that it knows the file is in the "snowball" format...

{noformat}

{noformat}

So there doesn't seem to be any functionality bug here -- just a documentation 
issue: when support was added for the "snowball" format, it appears that nothing 
was added to the class javadocs of the factory to make this clear.

If no one beats me to it, i'll clean this up next week.
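
To make the difference between the two formats concrete, here is a minimal, self-contained Java sketch. It is not the Lucene implementation behind "getWordSet" / "getSnowballWordSet"; the class and method names are hypothetical, and the parsing rules are simplified assumptions based on the description in this issue ("#" comment lines in the default format, "|" comments in the snowball format).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names and rules are assumptions, not Lucene's API.
public class StopwordFormatSketch {

    // Default-style parsing (assumed): skip lines that start with '#',
    // otherwise keep the whole trimmed line as one stop word.
    public static Set<String> parseDefault(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            if (line.startsWith("#")) {
                continue;
            }
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Snowball-style parsing (assumed): drop everything from the first '|'
    // onward, then split what is left on whitespace.
    public static Set<String> parseSnowball(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            String content = bar >= 0 ? line.substring(0, bar) : line;
            for (String word : content.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("ceci   |  this", " | a comment-only line", "cela");
        // With default-style parsing the "|" comment stays part of the stop word.
        System.out.println(parseDefault(lines));
        // With snowball-style parsing only the bare words remain.
        System.out.println(parseSnowball(lines));
    }
}
```

Under these assumptions, reading a snowball-format file with the default rules yields entries such as "ceci   |  this", which will never match a token; that is the "useless stopwords" symptom from the original report.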

> StopFilterFactory does not honor comments
> -
>
> Key: LUCENE-5211
> URL: https://issues.apache.org/jira/browse/LUCENE-5211
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.2
>Reporter: Hayden Muhl
>
> The StopFilterFactory builds a CharArraySet directly from the raw lines of 
> the supplied words file. This causes a problem when using the stop word files 
> supplied with the Solr/Lucene distribution. In particular, the comments in 
> those files get added to the CharArraySet. A line like this...
> ceci   |  this
> Should result in the string "ceci" being added to the CharArraySet, but "ceci 
>   |  this" is what actually gets added.
> Workaround: Remove all comments from stop word files you are using.
> Suggested fix: The StopFilterFactory should strip any comments, then strip 
> trailing whitespace. The stop word files supplied with the distribution 
> should be edited to conform to the supported comment format.


[jira] [Created] (LUCENE-5211) StopFilterFactory does not honor comments

2013-09-13 Thread Hayden Muhl (JIRA)

Hayden Muhl created LUCENE-5211:
---

 Summary: StopFilterFactory does not honor comments
 Key: LUCENE-5211
 URL: https://issues.apache.org/jira/browse/LUCENE-5211
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 4.2
Reporter: Hayden Muhl


The StopFilterFactory builds a CharArraySet directly from the raw lines of the 
supplied words file. This causes a problem when using the stop word files 
supplied with the Solr/Lucene distribution. In particular, the comments in 
those files get added to the CharArraySet. A line like this...

ceci   |  this

Should result in the string "ceci" being added to the CharArraySet, but "ceci   
|  this" is what actually gets added.

Workaround: Remove all comments from stop word files you are using.

Suggested fix: The StopFilterFactory should strip any comments, then strip 
trailing whitespace. The stop word files supplied with the distribution should 
be edited to conform to the supported comment format.


[jira] [Resolved] (SOLR-5238) Update the .css for the Ref Guide

2013-09-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-5238.


Resolution: Fixed

> Update the .css for the Ref Guide
> -
>
> Key: SOLR-5238
> URL: https://issues.apache.org/jira/browse/SOLR-5238
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 4.5
>
> Attachments: SolrRefGuide.css
>
>
> I put a custom .css into the Ref Guide before it was uploaded. I was going 
> for something parallel to the Solr website, but only spent a little time with 
> it. In terms of readability of the text online, it's not great, which is 
> putting it nicely. It's also very difficult to differentiate between "normal" 
> text and monospaced text to indicate a command, program name, etc.
> I'm attaching a new .css that can simply replace what's already in the Space 
> Admin -> Stylesheet section. I did several things with this:
> * cleaned up the .css generally, consolidated some repetitive sections, and 
> added more comments in case future changes are desired.
> * changed the font throughout to Helvetica, Arial, or sans-serif and updated 
> the color to a slightly less strong black.
> * changed the monospace font to match the font used in the code boxes 
> (Consolas) and made them the same color as the text (default is a lot 
> lighter).
> * added a bit more space between lines.
> * removed the negative margin in the header/breadcrumbs to give it a bit more 
> space.


[jira] [Commented] (SOLR-5238) Update the .css for the Ref Guide

2013-09-13 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767092#comment-13767092
 ] 

Hoss Man commented on SOLR-5238:


yeah... that looks much better.

also fixes weirdness with the scroll bar in the left nav that i didn't notice 
until you drew my attention to the footer.

thanks.

> Update the .css for the Ref Guide
> -
>
> Key: SOLR-5238
> URL: https://issues.apache.org/jira/browse/SOLR-5238
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 4.5
>
> Attachments: SolrRefGuide.css
>
>
> I put a custom .css into the Ref Guide before it was uploaded. I was going 
> for something parallel to the Solr website, but only spent a little time with 
> it. In terms of readability of the text online, it's not great, which is 
> putting it nicely. It's also very difficult to differentiate between "normal" 
> text and monospaced text to indicate a command, program name, etc.
> I'm attaching a new .css that can simply replace what's already in the Space 
> Admin -> Stylesheet section. I did several things with this:
> * cleaned up the .css generally, consolidated some repetitive sections, and 
> added more comments in case future changes are desired.
> * changed the font throughout to Helvetica, Arial, or sans-serif and updated 
> the color to a slightly less strong black.
> * changed the monospace font to match the font used in the code boxes 
> (Consolas) and made them the same color as the text (default is a lot 
> lighter).
> * added a bit more space between lines.
> * removed the negative margin in the header/breadcrumbs to give it a bit more 
> space.


[jira] [Reopened] (SOLR-5238) Update the .css for the Ref Guide

2013-09-13 Thread Cassandra Targett (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett reopened SOLR-5238:
-


I don't really understand Confluence sometimes...

I don't think this was there before, but now there is a big blank space at the 
bottom of the pages (I checked in FF and Chrome). Playing around with the 
inspector tools in the browser, it goes away if I change the #footer position 
to relative.

If you add this to the bottom of the stylesheet I think it should go away in 
the live site:

{code}
#footer {
position: relative !important;
}
{code}

Since it's the only #footer called in the custom css, and the custom css is the 
last one loaded, it should be OK at the end.

> Update the .css for the Ref Guide
> -
>
> Key: SOLR-5238
> URL: https://issues.apache.org/jira/browse/SOLR-5238
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 4.5
>
> Attachments: SolrRefGuide.css
>
>
> I put a custom .css into the Ref Guide before it was uploaded. I was going 
> for something parallel to the Solr website, but only spent a little time with 
> it. In terms of readability of the text online, it's not great, which is 
> putting it nicely. It's also very difficult to differentiate between "normal" 
> text and monospaced text to indicate a command, program name, etc.
> I'm attaching a new .css that can simply replace what's already in the Space 
> Admin -> Stylesheet section. I did several things with this:
> * cleaned up the .css generally, consolidated some repetitive sections, and 
> added more comments in case future changes are desired.
> * changed the font throughout to Helvetica, Arial, or sans-serif and updated 
> the color to a slightly less strong black.
> * changed the monospace font to match the font used in the code boxes 
> (Consolas) and made them the same color as the text (default is a lot 
> lighter).
> * added a bit more space between lines.
> * removed the negative margin in the header/breadcrumbs to give it a bit more 
> space.


Re: Joins on the confluence wiki

2013-09-13 Thread Kranti Parisa

Yes, there is a wiki page but I can't edit that either.

If you can create a page on Solr Ref Guide for Joins under Searching, I can
work with you for the same if I am not allowed to edit directly!

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Fri, Sep 13, 2013 at 1:22 PM, Cassandra Targett wrote:

> Kranti, are you referring to the Solr Ref Guide (which is a Confluence
> wiki)? I notice that there is a page already in the Solr wiki about
> joins: http://wiki.apache.org/solr/Join, but not one in the Ref Guide
> yet.
>
> Policies for editing the Solr Ref Guide are different from the Solr
> wiki, and are here:
>
> https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation
> .
>
> If you can't get access to create content for the Solr Ref Guide, you
> can still make comments with suggestions for improvements. If you do
> that, I'll be happy to add the new page and work with you to make it
> right.
>
> Cassandra
>
> On Fri, Sep 13, 2013 at 8:06 AM, Erick Erickson 
> wrote:
> > Just let us know your Wiki user ID and we'll add you
> > to the approved list right away.
> >
> > Had some trouble with spam bots a while back so had to go
> > this route.
> >
> > Thanks for volunteering to help!
> >
> > Erick
> >
> >
> > On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa 
> > wrote:
> >>
> >> Guys,
> >>
> >> Seems there is no wiki page for Joins. I have been using/working with Joins
> >> and I want to start writing a page for the same on the Confluence wiki. How
> >> can I get access for adding/editing the wiki pages?
> >>
> >> Thanks & Regards,
> >> Kranti K Parisa
> >> http://www.linkedin.com/in/krantiparisa
> >>
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

Re: Joins on the confluence wiki

2013-09-13 Thread Kranti Parisa

Erick,

My id is "*krantiparisa*"



Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Fri, Sep 13, 2013 at 9:06 AM, Erick Erickson wrote:

> Just let us know your Wiki user ID and we'll add you
> to the approved list right away.
>
> Had some trouble with spam bots a while back so had to go
> this route.
>
> Thanks for volunteering to help!
>
> Erick
>
>
> On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa wrote:
>
>> Guys,
>>
>> Seems there is no wiki page for Joins. I have been using/working with Joins
>> and I want to start writing a page for the same on the Confluence wiki. How
>> can I get access for adding/editing the wiki pages?
>>
>> Thanks & Regards,
>> Kranti K Parisa
>> http://www.linkedin.com/in/krantiparisa
>>
>>
>

[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766996#comment-13766996
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523114 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523114 ]

LUCENE-5207: load available javascript functions from resource file (properties)

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then by some function of score, price, 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.
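
The multi-key ranking described in the notes above (year descending, then a function of score, price and time ascending) can be sketched in plain Java. This is only an illustration of the idea and does not use the expressions module's API: the `Doc` class, the `EXPR` formula, and the `rank` helper are all hypothetical, with a lambda over doubles standing in for a compiled expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration only; not the Lucene expressions API.
public class ExpressionSortSketch {

    // Hypothetical document carrying the values named in the example.
    public static final class Doc {
        final String id;
        final int year;
        final double score;
        final double price;
        final double time;

        public Doc(String id, int year, double score, double price, double time) {
            this.id = id;
            this.year = year;
            this.score = score;
            this.price = price;
            this.time = time;
        }
    }

    // Stand-in for a compiled expression: a function from a document to a double.
    // The formula is made up purely for illustration.
    public static final ToDoubleFunction<Doc> EXPR =
            d -> d.price / (1.0 + d.score) + d.time;

    // Sort by year descending, then by the expression value ascending.
    public static List<String> rank(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingInt((Doc d) -> d.year)
                .reversed()
                .thenComparingDouble(EXPR));
        List<String> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("a", 2013, 1.0, 10.0, 0.0),
                new Doc("b", 2013, 4.0, 10.0, 0.0),
                new Doc("c", 2012, 9.0, 1.0, 0.0));
        System.out.println(rank(docs));
    }
}
```

Because the stand-in expression works directly on primitive doubles, the comparison involves no dynamic typing, which is the property the notes credit for the performance of the compiled expressions.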


[jira] [Updated] (LUCENE-5210) Unit tests for LicenseCheckTask.

2013-09-13 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-5210:


Description: 
While working on LUCENE-5209, I noticed the LicenseCheckTask is kind of a 
second class citizen - excluded from UI src folder setup and with no unit 
tests. This was a little scary to me.

I've started adding some unit tests. So far I have mainly just done the 
lifting of getting unit tests to work as part of tools.

I have added two super simple tests - really just the start - but something to 
build on.

  was:
While working on LUCENE-5209, I noticed the LicenseCheckerTask is kind of a 
second class citizen - excluded from UI src folder setup and with no unit 
tests. This was a little scary to me.

I've started adding some unit tests. So far I have mainly just done the 
lifting of getting unit tests to work as part of tools.

I have added two super simple tests - really just the start - but something to 
build on.


> Unit tests for LicenseCheckTask.
> 
>
> Key: LUCENE-5210
> URL: https://issues.apache.org/jira/browse/LUCENE-5210
> Project: Lucene - Core
>  Issue Type: Test
>  Components: general/build
>Reporter: Mark Miller
> Attachments: LUCENE-5210.patch
>
>
> While working on LUCENE-5209, I noticed the LicenseCheckTask is kind of a 
> second class citizen - excluded from UI src folder setup and with no unit 
> tests. This was a little scary to me.
> I've started adding some unit tests. So far I have mainly just done the 
> lifting of getting unit tests to work as part of tools.
> I have added two super simple tests - really just the start - but something 
> to build on.


[jira] [Updated] (LUCENE-5210) Unit tests for LicenseCheckTask.

2013-09-13 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-5210:


Attachment: LUCENE-5210.patch

> Unit tests for LicenseCheckTask.
> 
>
> Key: LUCENE-5210
> URL: https://issues.apache.org/jira/browse/LUCENE-5210
> Project: Lucene - Core
>  Issue Type: Test
>  Components: general/build
>Reporter: Mark Miller
> Attachments: LUCENE-5210.patch
>
>
> While working on LUCENE-5209, I noticed the LicenseCheckerTask is kind of a 
> second class citizen - excluded from UI src folder setup and with no unit 
> tests. This was a little scary to me.
> I've started adding some unit tests. So far I have mainly just done the 
> lifting of getting unit tests to work as part of tools.
> I have added two super simple tests - really just the start - but something 
> to build on.


[jira] [Commented] (LUCENE-5210) Unit tests for LicenseCheckTask.

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766992#comment-13766992
 ] 

Mark Miller commented on LUCENE-5210:
-

Also note: the test.jar files (fake) that belong in the test resources folder 
did not get picked up by the patch.

> Unit tests for LicenseCheckTask.
> 
>
> Key: LUCENE-5210
> URL: https://issues.apache.org/jira/browse/LUCENE-5210
> Project: Lucene - Core
>  Issue Type: Test
>  Components: general/build
>Reporter: Mark Miller
> Attachments: LUCENE-5210.patch
>
>
> While working on LUCENE-5209, I noticed the LicenseCheckTask is kind of a 
> second class citizen - excluded from UI src folder setup and with no unit 
> tests. This was a little scary to me.
> I've started adding some unit tests. So far I have mainly just done the 
> lifting of getting unit tests to work as part of tools.
> I have added two super simple tests - really just the start - but something 
> to build on.


[jira] [Resolved] (SOLR-5238) Update the .css for the Ref Guide

2013-09-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-5238.


Resolution: Fixed
  Assignee: Hoss Man

I've updated this in cwiki.

Cassandra: if you notice any glitches that need tweaking, please re-open.

> Update the .css for the Ref Guide
> -
>
> Key: SOLR-5238
> URL: https://issues.apache.org/jira/browse/SOLR-5238
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 4.5
>
> Attachments: SolrRefGuide.css
>
>
> I put a custom .css into the Ref Guide before it was uploaded. I was going 
> for something parallel to the Solr website, but only spent a little time with 
> it. In terms of readability of the text online, it's not great, which is 
> putting it nicely. It's also very difficult to differentiate between "normal" 
> text and monospaced text to indicate a command, program name, etc.
> I'm attaching a new .css that can simply replace what's already in the Space 
> Admin -> Stylesheet section. I did several things with this:
> * cleaned up the .css generally, consolidated some repetitive sections, and 
> added more comments in case future changes are desired.
> * changed the font throughout to Helvetica, Arial, or sans-serif and updated 
> the color to a slightly less strong black.
> * changed the monospace font to match the font used in the code boxes 
> (Consolas) and made them the same color as the text (default is a lot 
> lighter).
> * added a bit more space between lines.
> * removed the negative margin in the header/breadcrumbs to give it a bit more 
> space.


[jira] [Updated] (SOLR-5167) Ability to use AnalyzingInfixSuggester in Solr

2013-09-13 Thread Areek Zillur (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Areek Zillur updated SOLR-5167:
---

Attachment: SOLR-5167.patch

Proper way to use AnalyzingInfixSuggester in Solr + tests for the new 
LookupFactory

> Ability to use AnalyzingInfixSuggester in Solr
> --
>
> Key: SOLR-5167
> URL: https://issues.apache.org/jira/browse/SOLR-5167
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Reporter: Varun Thacker
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5167.patch, SOLR-5167.patch
>
>
> We should be able to use AnalyzingInfixSuggester in Solr by defining it in 
> solrconfig.xml


[jira] [Created] (LUCENE-5210) Unit tests for LicenseCheckTask.

2013-09-13 Thread Mark Miller (JIRA)

Mark Miller created LUCENE-5210:
---

 Summary: Unit tests for LicenseCheckTask.
 Key: LUCENE-5210
 URL: https://issues.apache.org/jira/browse/LUCENE-5210
 Project: Lucene - Core
  Issue Type: Test
  Components: general/build
Reporter: Mark Miller


While working on LUCENE-5209, I noticed the LicenseCheckTask is kind of a 
second class citizen - excluded from UI src folder setup and with no unit 
tests. This was a little scary to me.

I've started adding some unit tests. So far I have mainly just done the 
lifting of getting unit tests to work as part of tools.

I have added two super simple tests - really just the start - but something to 
build on.


Re: SimplePostToolTest very slow

2013-09-13 Thread Chris Hostetter


: If you want to experiment, a really trivial test is below -- on my system, 
: there is a ~5 second gap between each pair of "INIT" and "H1" timestamps, 
: but no other odd gaps in subsequent timestamps -- suggesting no caching of 
: DNS per hostname, but that the URL class doesn't "re-lookup" on subsequent 
: hashCode() calls.

Or maybe i could actually cut & paste the test this time...


import java.net.URL;
import org.apache.lucene.util.LuceneTestCase;

public class TestURLDNSAbsurdity extends LuceneTestCase {

  public void testHowSlowCanYouGo() throws Exception {
    // run twice to see whether the second pass benefits from any DNS caching
    go("1");
    go("2");
  }

  /** Prints a timestamp after each step to show which URL calls block. */
  public void go(String s) throws Exception {
    System.out.println(s + " PRE: " + System.currentTimeMillis());
    URL url = new URL("http://example.com/");
    System.out.println(s + "INIT: " + System.currentTimeMillis());
    int h1 = url.hashCode(); // first call may trigger hostname resolution
    System.out.println(s + "  H1: " + System.currentTimeMillis());
    int h2 = url.hashCode();
    System.out.println(s + "  H2: " + System.currentTimeMillis());
    boolean e1 = url.equals(this);
    System.out.println(s + "  E1: " + System.currentTimeMillis());
    boolean e2 = url.equals(this);
    System.out.println(s + "  E2: " + System.currentTimeMillis());
    assertEquals(h1, h2);
    assertEquals(e1, e2);
  }
}

...

   [junit4] Started J0 PID(31843@frisbee).
   [junit4] Suite: org.apache.solr.util.TestURLDNSAbsurdity
   [junit4]   1> 1 PRE: 1379107971313
   [junit4]   1> 1INIT: 1379107971314
   [junit4]   1> 1  H1: 1379107976449
   [junit4]   1> 1  H2: 1379107976449
   [junit4]   1> 1  E1: 1379107976449
   [junit4]   1> 1  E2: 1379107976449
   [junit4]   1> 2 PRE: 1379107976450
   [junit4]   1> 2INIT: 1379107976450
   [junit4]   1> 2  H1: 1379107981510
   [junit4]   1> 2  H2: 1379107981510
   [junit4]   1> 2  E1: 1379107981510
   [junit4]   1> 2  E2: 1379107981510
   [junit4] OK  10.3s | TestURLDNSAbsurdity.testHowSlowCanYouGo




-Hoss


[jira] [Commented] (LUCENE-5210) Unit tests for LicenseCheckTask.

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766983#comment-13766983
 ] 

Mark Miller commented on LUCENE-5210:
-

This patch is not complete, but it shows the current direction. There will be at 
least one more.

> Unit tests for LicenseCheckTask.
> 
>
> Key: LUCENE-5210
> URL: https://issues.apache.org/jira/browse/LUCENE-5210
> Project: Lucene - Core
>  Issue Type: Test
>  Components: general/build
>Reporter: Mark Miller
> Attachments: LUCENE-5210.patch
>
>
> While working on LUCENE-5209, I noticed the LicenseCheckTask is kind of a 
> second class citizen - excluded from UI src folder setup and with no unit 
> tests. This was a little scary to me.
> I've started adding some unit tests. So far I have mainly just done the 
> lifting of getting unit tests to work as part of tools.
> I have added two super simple tests - really just the start - but something 
> to build on.


Re: SimplePostToolTest very slow

2013-09-13 Thread Chris Hostetter


: I still don't see why you'd get different timings between Eclipse and
: Ant if you're running with the same VM -- it should be pretty
: consistent (either it caches dns lookups or it doesn't).

Maybe Eclipse mucks with networkaddress.cache.ttl & 
networkaddress.cache.negative.ttl ?
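
One way to check that guess is to print both settings from inside each environment. The snippet below is only a sketch: the class name DnsCacheSettings and the describe helper are made up for illustration, and it assumes the values of interest are the java.security properties (read with Security.getProperty), not system properties.

```java
import java.security.Security;

public class DnsCacheSettings {

  /** Formats one security property as "key=value", or "key=(unset)" when absent. */
  static String describe(String key) {
    String value = Security.getProperty(key);
    return key + "=" + (value == null ? "(unset)" : value);
  }

  public static void main(String[] args) {
    // Run once from Ant and once from Eclipse and compare the two outputs.
    System.out.println(describe("networkaddress.cache.ttl"));
    System.out.println(describe("networkaddress.cache.negative.ttl"));
  }
}
```

If the two environments print the same values, a difference in cache TTLs is probably not the explanation.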

If you want to experiment, a really trivial test is below -- on my system, 
there is a ~5 second gap between each pair of "INIT" and "H1" timestamps, 
but no other odd gaps in subsequent timestamps -- suggesting no caching of 
DNS per hostname, but that the URL class doesn't "re-lookup" on subsequent 
hashCode() calls.

-Hoss


Re: SimplePostToolTest very slow

2013-09-13 Thread Dawid Weiss

They should actually be faster in Ant than they are in Eclipse
(because Eclipse displays logs in the console and Ant flushes them to
a file)?

Dawid

On Fri, Sep 13, 2013 at 10:29 PM, Robert Muir  wrote:
> Loggers
>
> On Fri, Sep 13, 2013 at 4:22 PM, Dawid Weiss
>  wrote:
>> I still don't see why you'd get different timings between Eclipse and
>> Ant if you're running with the same VM -- it should be pretty
>> consistent (either it caches dns lookups or it doesn't).
>>
>> D.
>>
>>
>> On Fri, Sep 13, 2013 at 10:01 PM, Shai Erera  wrote:
>>> With the patch on SOLR-5241, the test runs for 1.8-3.2s, so this seems to
>>> solve the problem.
>>>
>>> I'm running w/ IBM J9 1.7.0, but it also happens w/ Oracle 1.7.0 (seems to
>>> be even slower!), in both Ant and Eclipse.
>>>
>>> You can reproduce by running "ant test -Dtestcase=SimplePostToolTest" from
>>> /solr/core. No specific seed as it's consistently slow without the patch,
>>> and fast with it.
>>>
>>> Thanks Hoss!
>>>
>>> Shai
>>>
>>>
>>> On Fri, Sep 13, 2013 at 10:02 PM, Dawid Weiss 
>>> wrote:

>>>> Does this test actually try to connect to those URLs? If not then a
>>>> fake:// URL handler would be a very elegant solution not reaching to
>>>> the DNS subsystem at all? Not that I want to write it -- I remember it
>>>> was kind of nightmarish :)
>>>>
>>>> Dawid
>>>>
>>>> On Fri, Sep 13, 2013 at 8:48 PM, Chris Hostetter
>>>>  wrote:
>>>> >
>>>> > : and changing the SimplePostTool instances to static and switching the
>>>> > : @Before to @BeforeClass causes the whole tests runtime to drop down to
>>>> > 50
>>>> > : seconds for me.
>>>> >
>>>> > i couldn't leave that change in because it introduced failures depending
>>>> > on test ordering (i thought those SimplePostTool objects were treated as
>>>> > immutible, but i was wrong)
>>>> >
>>>> > however: switching all usages of "example.com" to a "127.42.42.42" seems
>>>> > to have fixed things.
>>>> >
>>>> > Shai: can you confirm this patch resolves things for you...
>>>> >
>>>> > https://issues.apache.org/jira/browse/SOLR-5241
>>>> >
>>>> >
>>>> > -Hoss
>>>> >
>>>> > -
>>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>>> >
>>>>
>>>> -
>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: dev-h...@lucene.apache.org

>>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


Re: SimplePostToolTest very slow

2013-09-13 Thread Dawid Weiss

I still don't see why you'd get different timings between Eclipse and
Ant if you're running with the same VM -- it should be pretty
consistent (either it caches dns lookups or it doesn't).

D.


On Fri, Sep 13, 2013 at 10:01 PM, Shai Erera  wrote:
> With the patch on SOLR-5241, the test runs for 1.8-3.2s, so this seems to
> solve the problem.
>
> I'm running w/ IBM J9 1.7.0, but it also happens w/ Oracle 1.7.0 (seems to
> be even slower!), in both Ant and Eclipse.
>
> You can reproduce by running "ant test -Dtestcase=SimplePostToolTest" from
> /solr/core. No specific seed as it's consistently slow without the patch,
> and fast with it.
>
> Thanks Hoss!
>
> Shai
>
>
> On Fri, Sep 13, 2013 at 10:02 PM, Dawid Weiss 
> wrote:
>>
>> Does this test actually try to connect to those URLs? If not then a
>> fake:// URL handler would be a very elegant solution not reaching to
>> the DNS subsystem at all? Not that I want to write it -- I remember it
>> was kind of nightmarish :)
>>
>> Dawid
>>
>> On Fri, Sep 13, 2013 at 8:48 PM, Chris Hostetter
>>  wrote:
>> >
>> > : and changing the SimplePostTool instances to static and switching the
>> > : @Before to @BeforeClass causes the whole tests runtime to drop down to
>> > 50
>> > : seconds for me.
>> >
>> > i couldn't leave that change in because it introduced failures depending
>> > on test ordering (i thought those SimplePostTool objects were treated as
>> > immutible, but i was wrong)
>> >
>> > however: switching all usages of "example.com" to a "127.42.42.42" seems
>> > to have fixed things.
>> >
>> > Shai: can you confirm this patch resolves things for you...
>> >
>> > https://issues.apache.org/jira/browse/SOLR-5241
>> >
>> >
>> > -Hoss
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>


Re: Joins on the confluence wiki

2013-09-13 Thread David Smiley (@MITRE.org)

Erick,

Kranti referred to the *Confluence* wiki, in other words, the Solr Reference
Guide.  Am I correct that only committers can have write access to it? See:
https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation#Internal-MaintainingDocumentation-WhoCanEditThisDocumentation

~ David


Erick Erickson wrote
> Just let us know your Wiki user ID and we'll add you
> to the approved list right away.
> 
> Had some trouble with spam bots a while back so had to go
> this route.
> 
> Thanks for volunteering to help!
> 
> Erick
> 
> 
> On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa <kranti.parisa@> wrote:
> 
>> Guys,
>>
>> Seems there is no wiki page for Joins. I have been using/working with Joins
>> and I want to start writing a page for the same on the Confluence wiki.
>> How can I get access for adding/editing the wiki pages?
>>
>> Thanks & Regards,
>> Kranti K Parisa
>> http://www.linkedin.com/in/krantiparisa
>>
>>





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book

Re: SimplePostToolTest very slow

2013-09-13 Thread Robert Muir

Loggers

On Fri, Sep 13, 2013 at 4:22 PM, Dawid Weiss
 wrote:
> I still don't see why you'd get different timings between Eclipse and
> Ant if you're running with the same VM -- it should be pretty
> consistent (either it caches dns lookups or it doesn't).
>
> D.
>
>
> On Fri, Sep 13, 2013 at 10:01 PM, Shai Erera  wrote:
>> With the patch on SOLR-5241, the test runs for 1.8-3.2s, so this seems to
>> solve the problem.
>>
>> I'm running w/ IBM J9 1.7.0, but it also happens w/ Oracle 1.7.0 (seems to
>> be even slower!), in both Ant and Eclipse.
>>
>> You can reproduce by running "ant test -Dtestcase=SimplePostToolTest" from
>> /solr/core. No specific seed as it's consistently slow without the patch,
>> and fast with it.
>>
>> Thanks Hoss!
>>
>> Shai
>>
>>
>> On Fri, Sep 13, 2013 at 10:02 PM, Dawid Weiss 
>> wrote:
>>>
>>> Does this test actually try to connect to those URLs? If not then a
>>> fake:// URL handler would be a very elegant solution not reaching to
>>> the DNS subsystem at all? Not that I want to write it -- I remember it
>>> was kind of nightmarish :)
>>>
>>> Dawid
>>>
>>> On Fri, Sep 13, 2013 at 8:48 PM, Chris Hostetter
>>>  wrote:
>>> >
>>> > : and changing the SimplePostTool instances to static and switching the
>>> > : @Before to @BeforeClass causes the whole tests runtime to drop down to
>>> > 50
>>> > : seconds for me.
>>> >
>>> > i couldn't leave that change in because it introduced failures depending
>>> > on test ordering (i thought those SimplePostTool objects were treated as
>>> > immutible, but i was wrong)
>>> >
>>> > however: switching all usages of "example.com" to a "127.42.42.42" seems
>>> > to have fixed things.
>>> >
>>> > Shai: can you confirm this patch resolves things for you...
>>> >
>>> > https://issues.apache.org/jira/browse/SOLR-5241
>>> >
>>> >
>>> > -Hoss
>>> >
>>> > -
>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


[jira] [Updated] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5241:
---

Attachment: SOLR-5241.patch

FYI: today is the first time i've looked at this test, so take all of my 
comments with a grain of salt...

bq. why is it trying to resolve the host? Is it so that it can then try to 
connect to it and the test expects that this will fail?

From what i can tell, *nothing* in the test is trying to resolve the 
hostname or IP or connect to any of these URLs.  A "MockPageFetcher" is 
explicitly plugged into the SimplePostTool when used in the test to mock 
out the HTTP communication to prevent this.

The problem (again: as far as i can tell) is entirely because the tests (and 
underlying code in SimplePostTool) use the java.net.URL class, which can/will 
attempt hostname resolution of DNS names used in URLs under the covers in some 
cases -- notably any time the URL.equals() or URL.hashCode() methods are called.
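
For comparison, java.net.URI compares and hashes its components as strings and does not consult DNS. The snippet below is a minimal sketch of using URI values as set keys; the class name UriKeys and the distinct helper are made up for illustration, and whether SimplePostTool could switch from URL to URI is an untested assumption, not what the attached patch does.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

public class UriKeys {

  /** Collects the distinct URIs among the given specs, keeping first-seen order. */
  static Set<URI> distinct(String... specs) {
    Set<URI> seen = new LinkedHashSet<>();
    for (String spec : specs) {
      // URI.hashCode()/equals() work on the parsed components; no lookup happens
      seen.add(URI.create(spec));
    }
    return seen;
  }

  public static void main(String[] args) {
    Set<URI> uris = distinct(
        "http://example.com/a",
        "http://example.com/a",
        "http://example.com/b");
    System.out.println(uris);
  }
}
```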

bq. use ips like [ff01::114] instead.

I've got no problem doing that if you think it makes a difference -- but just so i 
understand: can you explain why that is better than 127.42.42.42 ?

bq. if you really just need a URL, why not use file://

That might work, but the SimplePostTool actually supports different options for 
dealing with local files vs remote web urls, and the test's MockPageFetcher 
actually simulates servers that respond with different HTTP status codes, so i'm 
not sure if using "file://" will work and/or test the same things.



attaching an updated patch using "[ff01::114]" instead of "127.42.42.42" per 
rmuir's request.

[~shaie]: does this still run "fast" for you?

any objections from anyone to committing this?

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch, SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766877#comment-13766877
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523075 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523075 ]

LUCENE-5207: add comment that the regen hack does not work in Java 8

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. Sort by year descending, then some function of score, price, 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.


Re: SimplePostToolTest very slow

2013-09-13 Thread Shai Erera

With the patch on SOLR-5241, the test runs for 1.8-3.2s, so this seems to
solve the problem.

I'm running w/ IBM J9 1.7.0, but it also happens w/ Oracle 1.7.0 (seems to
be even slower!), in both Ant and Eclipse.

You can reproduce by running "ant test -Dtestcase=SimplePostToolTest" from
/solr/core. No specific seed as it's consistently slow without the patch,
and fast with it.

Thanks Hoss!

Shai


On Fri, Sep 13, 2013 at 10:02 PM, Dawid Weiss
wrote:

> Does this test actually try to connect to those URLs? If not then a
> fake:// URL handler would be a very elegant solution not reaching to
> the DNS subsystem at all? Not that I want to write it -- I remember it
> was kind of nightmarish :)
>
> Dawid
>
> On Fri, Sep 13, 2013 at 8:48 PM, Chris Hostetter
>  wrote:
> >
> > : and changing the SimplePostTool instances to static and switching the
> > : @Before to @BeforeClass causes the whole tests runtime to drop down to
> 50
> > : seconds for me.
> >
> > i couldn't leave that change in because it introduced failures depending
> > on test ordering (i thought those SimplePostTool objects were treated as
> > immutible, but i was wrong)
> >
> > however: switching all usages of "example.com" to a "127.42.42.42" seems
> > to have fixed things.
> >
> > Shai: can you confirm this patch resolves things for you...
> >
> > https://issues.apache.org/jira/browse/SOLR-5241
> >
> >
> > -Hoss
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766874#comment-13766874
 ] 

Shai Erera commented on SOLR-5241:
--

With this patch, the test runs for 1.8-3.2s (varies, but *much* faster than 
before).

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766840#comment-13766840
 ] 

Robert Muir commented on SOLR-5241:
---

why is it trying to resolve the host? Is it so that it can then try to connect 
to it and the test expects that this will fail?

It's the latter part that will cause the issue: use ips like [ff01::114] instead.

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766811#comment-13766811
 ] 

Robert Muir commented on SOLR-5241:
---

Won't this still be an issue for jenkins runs because even loopback addresses 
are blackholed?


> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


Re: SimplePostToolTest very slow

2013-09-13 Thread Dawid Weiss

Does this test actually try to connect to those URLs? If not then a
fake:// URL handler would be a very elegant solution not reaching to
the DNS subsystem at all? Not that I want to write it -- I remember it
was kind of nightmarish :)

Dawid

On Fri, Sep 13, 2013 at 8:48 PM, Chris Hostetter
 wrote:
>
> : and changing the SimplePostTool instances to static and switching the
> : @Before to @BeforeClass causes the whole tests runtime to drop down to 50
> : seconds for me.
>
> i couldn't leave that change in because it introduced failures depending
> on test ordering (i thought those SimplePostTool objects were treated as
> immutible, but i was wrong)
>
> however: switching all usages of "example.com" to a "127.42.42.42" seems
> to have fixed things.
>
> Shai: can you confirm this patch resolves things for you...
>
> https://issues.apache.org/jira/browse/SOLR-5241
>
>
> -Hoss
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766835#comment-13766835
 ] 

Dawid Weiss commented on SOLR-5241:
---

I'd have to check, I don't remember. Looking at this it seems you could write a 
custom parsing routine --
http://docs.oracle.com/javase/7/docs/api/java/net/URLStreamHandler.html#parseURL(java.net.URL, java.lang.String, int, int)

but this may be overkill.

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766818#comment-13766818
 ] 

Dawid Weiss commented on SOLR-5241:
---

Yeah... maybe that fake:// protocol handler is actually a sensible idea for 
such tests.

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766844#comment-13766844
 ] 

Robert Muir commented on SOLR-5241:
---

and if you really just need a URL, why not use file://

> SimplePostToolTest is slow on some systmes - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com"


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systmes - likely due to hostname resolution of "example.com"

2013-09-13 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766827#comment-13766827
 ] 

Hoss Man commented on SOLR-5241:


bq. maybe that fake:// protocol handler is actually a sensible idea for such 
tests.

I'm not sure, but i think the URL class would still try to resolve the hostname 
portion of the URL, even if we registered our own fake protocol.

---

I really don't see how the blackhole could affect this even if it did block dns 
lookups, since no lookup should ever happen with an ip specified, but if it 
does then i think the whole test just needs to be rewritten so that it does not 
use the URL class.


> SimplePostToolTest is slow on some systems - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com".


[jira] [Commented] (SOLR-5241) SimplePostToolTest is slow on some systems - likely due to hostname resolution of "example.com"

2013-09-13 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766823#comment-13766823
 ] 

Hoss Man commented on SOLR-5241:


I don't think so, but maybe i don't fully understand what the FreeBSD blackhole 
does.

The test never attempts to open any sockets to these URL objects -- the problem 
so far (that i can see) is just that by nature of being java.net.URL, there is 
a DNS check when the equals/hashCode methods get used and that seems to be the 
speed problem when the urls contain "example.com" ... so i figured using a 
"safe" IP would prevent that...

http://www.eishay.com/2008/04/javas-url-little-secret.html

is there any reason the freebsd blackhole would affect dns lookups on 
"127.42.42.42" even if the URL class did decide to try to "resolve" that IP as 
a hostname (i don't think it does) ?
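
To make the equals/hashCode point concrete, here is a minimal sketch, assuming the code only needs URL-shaped keys for sets and maps and never opens a connection. java.net.URI compares its components textually, so no name lookup is involved. The class UrlKeys and its methods are hypothetical and are not taken from SimplePostTool or from the attached patch.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of SimplePostTool: keeps visited links as
// java.net.URI values so that set membership never touches the resolver.
public class UrlKeys {
    private final Set<URI> visited = new HashSet<>();

    // Returns true the first time a link is seen, false on repeats.
    public boolean markVisited(String link) {
        // normalize() only rewrites "." and ".." path segments;
        // it is a purely textual operation.
        return visited.add(URI.create(link).normalize());
    }

    // Number of distinct links recorded so far.
    public int size() {
        return visited.size();
    }

    public static void main(String[] args) {
        UrlKeys keys = new UrlKeys();
        System.out.println(keys.markVisited("http://example.com/page1"));
        System.out.println(keys.markVisited("http://example.com/./page1"));
        System.out.println(keys.size());
    }
}
```

Whether this fits the test depends on how much of it is tied to java.net.URL signatures; switching the hostnames to an IP literal, as the patch does, is the smaller change.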

> SimplePostToolTest is slow on some systems - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com".


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766797#comment-13766797
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523059 from [~rcmuir] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523059 ]

LUCENE-5207: enforce encoding and locale (for paranoia reasons)

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then by some function of score, price 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.


[jira] [Created] (LUCENE-5209) Allow the license checker to optionally avoid check sum comparisons on SNAPSHOT dependencies.

2013-09-13 Thread Mark Miller (JIRA)

Mark Miller created LUCENE-5209:
---

 Summary: Allow the license checker to optionally avoid check sum 
comparisons on SNAPSHOT dependencies.
 Key: LUCENE-5209
 URL: https://issues.apache.org/jira/browse/LUCENE-5209
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/build
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, 4.6


SNAPSHOTs cannot actually be used and released by Lucene/Solr, but we use them 
downstream in some cases during development - we have to harmonize jars across 
multiple projects.

It would be nice if we could avoid doing the checksum check on SNAPSHOTs, but 
still do the license check (dev adds any dependency, dev must add license 
immediately).

This first patch adds a new system property called skipSnapshotsChecksum - if 
you set it to true, SNAPSHOT dependencies will not be checksum-compared.

I think this change makes the license checker more consumable.
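
As a rough illustration of the behaviour described above (not the code in the attached patch), the decision could look like the sketch below. Only the property name skipSnapshotsChecksum comes from this issue; the class name, the method names, and the assumption that a SNAPSHOT jar is recognised by "-SNAPSHOT" in its file name are all hypothetical.

```java
// Hypothetical sketch, not the code in LUCENE-5209.patch.
public class SnapshotChecksumPolicy {
    // Property name taken from the issue description.
    public static final String SKIP_PROPERTY = "skipSnapshotsChecksum";

    // Assumption: a SNAPSHOT jar is recognised by its file name.
    public static boolean isSnapshot(String jarName) {
        return jarName.contains("-SNAPSHOT");
    }

    // Only the checksum comparison is optional; the license check is not
    // affected by this decision and always runs.
    public static boolean shouldCompareChecksum(String jarName, boolean skipSnapshots) {
        return !(skipSnapshots && isSnapshot(jarName));
    }

    public static void main(String[] args) {
        // Boolean.getBoolean is true only when the system property is set
        // to "true" (case-insensitive); unset or anything else is false.
        boolean skip = Boolean.getBoolean(SKIP_PROPERTY);
        for (String jar : args) {
            System.out.println(jar + " compareChecksum=" + shouldCompareChecksum(jar, skip));
        }
    }
}
```

Under these assumptions, running with -DskipSnapshotsChecksum=true would skip the comparison for SNAPSHOT jars only, and release jars would still be compared.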


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766760#comment-13766760
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523047 from [~rcmuir] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523047 ]

LUCENE-5207: upgrade checksum/maven

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then by some function of score, price 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.


Re: SimplePostToolTest very slow

2013-09-13 Thread Chris Hostetter


: and changing the SimplePostTool instances to static and switching the 
: @Before to @BeforeClass causes the whole tests runtime to drop down to 50 
: seconds for me.

i couldn't leave that change in because it introduced failures depending 
on test ordering (i thought those SimplePostTool objects were treated as 
immutable, but i was wrong)
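
A small sketch of why sharing a mutable object across tests becomes order-dependent. It uses plain Java rather than JUnit so it can run on its own; Tool, testA and testB are hypothetical stand-ins and do not reflect the real SimplePostTool fields or test methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tool object that tests configure and mutate.
public class SharedFixtureDemo {
    public static class Tool {
        public final List<String> args = new ArrayList<>();
    }

    // "Test" A leaves an extra argument behind on the tool it was given.
    public static boolean testA(Tool tool) {
        tool.args.add("-recursive");
        return tool.args.size() == 1;
    }

    // "Test" B assumes a freshly built tool with no arguments.
    public static boolean testB(Tool tool) {
        return tool.args.isEmpty();
    }

    public static void main(String[] args) {
        // Per-test setup (the @Before style): each test gets its own instance.
        System.out.println(testA(new Tool()) && testB(new Tool()));

        // Shared setup (the @BeforeClass style): B only passes if it runs
        // before A has mutated the shared instance.
        Tool shared = new Tool();
        System.out.println(testA(shared) && testB(shared));
    }
}
```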

however: switching all usages of "example.com" to a "127.42.42.42" seems 
to have fixed things.

Shai: can you confirm this patch resolves things for you...

https://issues.apache.org/jira/browse/SOLR-5241


-Hoss


[jira] [Updated] (LUCENE-5209) Allow the license checker to optionally avoid check sum comparisons on SNAPSHOT dependencies.

2013-09-13 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-5209:


Attachment: LUCENE-5209.patch

> Allow the license checker to optionally avoid check sum comparisons on 
> SNAPSHOT dependencies.
> -
>
> Key: LUCENE-5209
> URL: https://issues.apache.org/jira/browse/LUCENE-5209
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5209.patch
>
>
> SNAPSHOTs cannot actually be used and released by Lucene/Solr, but we use 
> them downstream in some cases during development - we have to harmonize jars 
> across multiple projects.
> It would be nice if we could avoid doing the checksum check on SNAPSHOTs, 
> but still do the license check (dev adds any dependency, dev must add license 
> immediately).
> This first patch adds a new system property called skipSnapshotsChecksum - if 
> you set it to true, SNAPSHOT dependencies will not be checksum-compared.
> I think this change makes the license checker more consumable.


[jira] [Updated] (SOLR-5241) SimplePostToolTest is slow on some systems - likely due to hostname resolution of "example.com"

2013-09-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5241:
---

Attachment: SOLR-5241.patch

Patch switching all usage of example.com to a loopback IP.

This makes the entire test class take 2 seconds on my system (it took 6 minutes 
before this).

> SimplePostToolTest is slow on some systems - likely due to hostname 
> resolution of "example.com"
> ---
>
> Key: SOLR-5241
> URL: https://issues.apache.org/jira/browse/SOLR-5241
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Attachments: SOLR-5241.patch
>
>
> As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
> slow when run from ant, but only takes 1 second in his IDE.
> The problem seems to be related to the URL class attempting to resolve 
> "example.com".


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766751#comment-13766751
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523042 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523042 ]

LUCENE-5207: Update to antlr 3.5 (which produces no warnings while compiling 
with java 7). Also fix the regen-macro to handle windows file paths while 
replacing

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then by some function of score, price 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.


[jira] [Created] (SOLR-5241) SimplePostToolTest is slow on some systems - likely due to hostname resolution of "example.com"

2013-09-13 Thread Hoss Man (JIRA)

Hoss Man created SOLR-5241:
--

 Summary: SimplePostToolTest is slow on some systems - likely due 
to hostname resolution of "example.com"
 Key: SOLR-5241
 URL: https://issues.apache.org/jira/browse/SOLR-5241
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man


As noted by Shai on the dev @lucene list, SimplePostToolTest is ridiculously 
slow when run from ant, but only takes 1 second in his IDE.

The problem seems to be related to the URL class attempting to resolve "example.com".


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766785#comment-13766785
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523057 from [~rcmuir] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523057 ]

LUCENE-5207: try a hack around antlr hashmap bugs

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then by some function of score, price 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.


[jira] [Commented] (LUCENE-5209) Allow the license checker to optionally avoid check sum comparisons on SNAPSHOT dependencies.

2013-09-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766753#comment-13766753
 ] 

Robert Muir commented on LUCENE-5209:
-

+1

> Allow the license checker to optionally avoid check sum comparisons on 
> SNAPSHOT dependencies.
> -
>
> Key: LUCENE-5209
> URL: https://issues.apache.org/jira/browse/LUCENE-5209
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5209.patch
>
>
> SNAPSHOTs cannot actually be used and released by Lucene/Solr, but we use 
> them downstream in some cases during development - we have to harmonize jars 
> across multiple projects.
> It would be nice if we could avoid doing the checksum check on SNAPSHOTs, 
> but still do the license check (dev adds any dependency, dev must add license 
> immediately).
> This first patch adds a new system property called skipSnapshotsChecksum - if 
> you set it to true, SNAPSHOT dependencies will not be checksum-compared.
> I think this change makes the license checker more consumable.


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766757#comment-13766757
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523046 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523046 ]

LUCENE-5207: replace tabs by 2 spaces now. antlr 3.5 produces tabs consistently 
now, so we can replace them (no mixed tabs anymore)

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then by some function of score, price 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.


Re: SimplePostToolTest very slow

2013-09-13 Thread Chris Hostetter


: Am I the only one that experiences this slowness?

I can reproduce the speeds you are seeing from ant (no IDE to test from on 
my end)

It looks like some sort of delay in init ... every test has ~25s delay, 
and changing the SimplePostTool instances to static and switching the 
@Before to @BeforeClass causes the whole tests runtime to drop down to 50 
seconds for me.

I bet this is something related to hostname resolution with the 
"example.com" domain used in this test ... a MockPageFetcher is used that 
never hits the URLs in question, but i bet somewhere in init they are 
still getting resolved.

I'll keep digging.


-Hoss


Re: SimplePostToolTest very slow

2013-09-13 Thread Jack Krupansky

Sounds more like a timeout. I mean, posting documents should be reasonably 
fast.


-- Jack Krupansky

-Original Message- 
From: Michael McCandless

Sent: Friday, September 13, 2013 7:04 AM
To: Lucene/Solr dev
Subject: Re: SimplePostToolTest very slow

It passes when run from Eclipse?  It seems crazy that it can take 280
sec from ant but 1 sec from Eclipse.  Maybe it's not actually running,
when running from Eclipse?

+1 for @Nightly if we can't get to the bottom of it ...

Mike McCandless

http://blog.mikemccandless.com


On Thu, Sep 12, 2013 at 5:03 PM, Shai Erera  wrote:

Hi

I was running Solr tests now and thought they hung, but eventually they
continued and I noticed that SimplePostToolTest took 280s to complete. I
tried from eclipse, and it took 1s. Tried from Ant again, 276s. I compared
(briefly) the outputs of the test from eclipse and Ant, and they look
similar.

Is this expected? Maybe when the test runs from Ant it does more work (i.e.
system properties that are sent from build.xml but not in eclipse cause it
to index more data or something?). If it helps, here's what the test prints:


   [junit4] Suite: org.apache.solr.util.SimplePostToolTest
   [junit4]   2> log4j:WARN No such property [conversionPattern] in
org.apache.solr.util.SolrLogLayout.
   [junit4]   2> 396 T24 oas.SolrTestCaseJ4.setUp ###Starting testIsOn
   [junit4]   2> 23126 T24 oas.SolrTestCaseJ4.tearDown ###Ending testIsOn
   [junit4] OK  22.8s | SimplePostToolTest.testIsOn
   [junit4]   2> 23163 T24 oas.SolrTestCaseJ4.setUp ###Starting
testAppendUrlPath
   [junit4]   2> 54667 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testAppendUrlPath
   [junit4] OK  31.5s | SimplePostToolTest.testAppendUrlPath
   [junit4]   2> 54682 T24 oas.SolrTestCaseJ4.setUp ###Starting
testGuessType
   [junit4]   2> 77185 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testGuessType
   [junit4] OK  22.5s | SimplePostToolTest.testGuessType
   [junit4]   2> 77198 T24 oas.SolrTestCaseJ4.setUp ###Starting
testTypeSupported
   [junit4]   2> 99701 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testTypeSupported
   [junit4] OK  22.5s | SimplePostToolTest.testTypeSupported
   [junit4]   2> 99712 T24 oas.SolrTestCaseJ4.setUp ###Starting
testRobotsExclusion
   [junit4]   2> 122214 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testRobotsExclusion
   [junit4] OK  22.5s | SimplePostToolTest.testRobotsExclusion
   [junit4]   2> 15 T24 oas.SolrTestCaseJ4.setUp ###Starting
testParseArgsAndInit
   [junit4]   2> 144727 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testParseArgsAndInit
   [junit4] OK  22.5s | SimplePostToolTest.testParseArgsAndInit
   [junit4]   2> 144736 T24 oas.SolrTestCaseJ4.setUp ###Starting
testDoWebMode
   [junit4]   2> SimplePostTool: WARNING: The URL
http://example.com/disallowed returned a HTTP result status of 403
   [junit4]   2> 185795 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testDoWebMode
   [junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
   [junit4]   1> POSTed web resource http://example.com (depth: 0)
   [junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
   [junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
   [junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
   [junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
   [junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
   [junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
   [junit4]   1> POSTed web resource http://example.com/page1/foo/bar
(depth: 3)
   [junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
   [junit4]   1> POSTed web resource http://example.com (depth: 0)
   [junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
   [junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
   [junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
   [junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
   [junit4]   1> POSTed web resource http://example.com (depth: 0)
   [junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
   [junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
   [junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
   [junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
   [junit4]   1> POSTed web resource http://example.com/disallowed (depth: 2)
   [junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
   [junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
   [junit4]   1> POSTed web resource http://example.com/page1/foo/bar
(depth: 3)
   [junit4] OK  41.1s | SimplePostToolTest.testDoWebMode
   [junit4]   2> 185806 T24 oas.SolrTestCaseJ4.setUp ###Starting
testAppendParam
   [junit4]   2> 208310 T24 oas.SolrTestCaseJ4.tearDown ###Ending
testAppendParam
   [junit4] OK  22.5s | SimplePostToolTest.testAppendParam
   [junit4]   2> 208320 T24 oas.SolrTestCaseJ4.setUp ###Starting
testComputeFullUrl
   [juni

Re: SimplePostToolTest very slow

2013-09-13 Thread Dawid Weiss

Is it with the same seed?

Dawid

On Fri, Sep 13, 2013 at 1:04 PM, Michael McCandless
 wrote:
> It passes when run from Eclipse?  It seems crazy that it can take 280
> sec from ant but 1 sec from Eclipse.  Maybe it's not actually running,
> when running from Eclipse?
>
> +1 for @Nightly if we can't get to the bottom of it ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Sep 12, 2013 at 5:03 PM, Shai Erera  wrote:
>> Hi
>>
>> I was running Solr tests now and thought they hung, but eventually they
>> continued and I noticed that SimplePostToolTest took 280s to complete. I
>> tried from eclipse, and it took 1s. Tried from Ant again, 276s. I compared
>> (briefly) the outputs of the test from eclipse and Ant, and they look
>> similar.
>>
>> Is this expected? Maybe when the test runs from Ant it does more work (i.e.
>> system properties that are sent from build.xml but not in eclipse cause it
>> to index more data or something?). If it helps, here's what the test prints:
>>
>>[junit4] Suite: org.apache.solr.util.SimplePostToolTest
>>[junit4]   2> log4j:WARN No such property [conversionPattern] in
>> org.apache.solr.util.SolrLogLayout.
>>[junit4]   2> 396 T24 oas.SolrTestCaseJ4.setUp ###Starting testIsOn
>>[junit4]   2> 23126 T24 oas.SolrTestCaseJ4.tearDown ###Ending testIsOn
>>[junit4] OK  22.8s | SimplePostToolTest.testIsOn
>>[junit4]   2> 23163 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testAppendUrlPath
>>[junit4]   2> 54667 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testAppendUrlPath
>>[junit4] OK  31.5s | SimplePostToolTest.testAppendUrlPath
>>[junit4]   2> 54682 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testGuessType
>>[junit4]   2> 77185 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testGuessType
>>[junit4] OK  22.5s | SimplePostToolTest.testGuessType
>>[junit4]   2> 77198 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testTypeSupported
>>[junit4]   2> 99701 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testTypeSupported
>>[junit4] OK  22.5s | SimplePostToolTest.testTypeSupported
>>[junit4]   2> 99712 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testRobotsExclusion
>>[junit4]   2> 122214 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testRobotsExclusion
>>[junit4] OK  22.5s | SimplePostToolTest.testRobotsExclusion
>>[junit4]   2> 15 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testParseArgsAndInit
>>[junit4]   2> 144727 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testParseArgsAndInit
>>[junit4] OK  22.5s | SimplePostToolTest.testParseArgsAndInit
>>[junit4]   2> 144736 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testDoWebMode
>>[junit4]   2> SimplePostTool: WARNING: The URL
>> http://example.com/disallowed returned a HTTP result status of 403
>>[junit4]   2> 185795 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testDoWebMode
>>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
>>[junit4]   1> POSTed web resource http://example.com (depth: 0)
>>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
>>[junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
>>[junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
>>[junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
>>[junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
>>[junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
>>[junit4]   1> POSTed web resource http://example.com/page1/foo/bar
>> (depth: 3)
>>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
>>[junit4]   1> POSTed web resource http://example.com (depth: 0)
>>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
>>[junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
>>[junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
>>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
>>[junit4]   1> POSTed web resource http://example.com (depth: 0)
>>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
>>[junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
>>[junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
>>[junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
>>[junit4]   1> POSTed web resource http://example.com/disallowed (depth:
>> 2)
>>[junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
>>[junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
>>[junit4]   1> POSTed web resource http://example.com/page1/foo/bar
>> (depth: 3)
>>[junit4] OK  41.1s | SimplePostToolTest.testDoWebMode
>>[junit4]   2> 185806 T24 oas.SolrTestCaseJ4.setUp ###Starting
>> testAppendParam
>>[junit4]   2> 208310 T24 oas.SolrTestCaseJ4.tearDown ###Ending
>> testAppendParam
>>[junit4] OK  22.5s | SimplePostToolTest.testAppendParam
>>

[jira] [Comment Edited] (SOLR-4787) Join Contrib

2013-09-13 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766446#comment-13766446
 ] 

Joel Bernstein edited comment on SOLR-4787 at 9/13/13 12:46 PM:


Kranti, the bjoin now supports multi-value fields. I'll work on getting the 
patch up here today.

  was (Author: joel.bernstein):
Kranti, the bjoin now supports multi-value joins. I'll work on getting the 
patch up here today.
  
> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *HashSetJoinQParserPlugin aka hjoin*
> The hjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the hjoin is designed to work with int and long join 
> keys only. So, in order to use hjoin, int or long join keys must be included 
> in both the to and from core.
> The second difference is that the hjoin builds memory structures that are 
> used to quickly connect the join keys. So, the hjoin will need more memory 
> than the JoinQParserPlugin to perform the join.
> The main advantage of the hjoin is that it can scale to join millions of keys 
> between cores and provide sub-second response time. The hjoin should work 
> well with up to two million results from the fromIndex and tens of millions 
> of results from the main query.
> The hjoin supports the following features:
> 1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will 
> turn on the PostFilter. The PostFilter will typically outperform the Lucene 
> query when the main query results have been narrowed down.
> 2) With the lucene query implementation there is an option to build the 
> filter with threads. This can greatly improve the performance of the query if 
> the main query index is very large. The "threads" parameter turns on 
> threading. For example *threads=6* will use 6 threads to build the filter. 
> This will set up a fixed threadpool with six threads to handle all hjoin 
> requests. Once the threadpool is created the hjoin will always use it to 
> build the filter. Threading does not come into play with the PostFilter.
> 3) The *size* local parameter can be used to set the initial size of the 
> hashset used to perform the join. If this is set above the number of results 
> from the fromIndex then you can avoid hashset resizing, which improves 
> performance.
> 4) Nested filter queries. The local parameter "fq" can be used to nest a 
> filter query within the join. The nested fq will filter the results of the 
> join query. This can point to another join to support nested joins.
> 5) Full caching support for the lucene query implementation. The filterCache 
> and queryResultCache should work properly even with deep nesting of joins. 
> Only the queryResultCache comes into play with the PostFilter implementation 
> because PostFilters are not cacheable in the filterCache.
> The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
> plugin is referenced by the string "hjoin" rather than "join".
> fq={!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
> fq=$qq}user:customer1&qq=group:5
> The example filter query above will search the fromIndex (collection2) for 
> "user:customer1" applying the local fq parameter to filter the results. The 
> lucene filter query will be built using 6 threads. This query will generate a 
> list of values from the "from" field that will be used to filter the main 
> query. Only records from the main query, where the "to" field is present in 
> the "from" list will be included in the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> hjoin.
>  class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
>  
>  
> *BitSetJoinQParserPlugin aka bj
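
The hjoin description quoted above boils down to a hash-set membership test: collect the join keys from the fromIndex results, then keep only main-query documents whose "to" key is in that set. A minimal Python sketch of that idea follows. It is not the contrib's code; the function names and the sample documents are made up for illustration, and documents are plain dicts rather than Solr documents.

```python
from typing import Dict, Iterable, List, Set


def collect_from_keys(from_docs: Iterable[Dict[str, object]], from_field: str) -> Set[int]:
    """Collect the integer join keys of the documents matched in the "from" core."""
    keys: Set[int] = set()
    for doc in from_docs:
        value = doc.get(from_field)
        if isinstance(value, int):
            keys.add(value)
    return keys


def hash_set_join(
    main_docs: Iterable[Dict[str, object]], to_field: str, from_keys: Set[int]
) -> List[Dict[str, object]]:
    """Keep only the main-query documents whose "to" key is in from_keys."""
    return [doc for doc in main_docs if doc.get(to_field) in from_keys]


# Hypothetical data: two documents matched in the from core, three in the main core.
from_docs = [{"id_i": 1}, {"id_i": 3}]
main_docs = [
    {"id_i": 1, "title": "a"},
    {"id_i": 2, "title": "b"},
    {"id_i": 3, "title": "c"},
]

from_keys = collect_from_keys(from_docs, "id_i")
joined_titles = [doc["title"] for doc in hash_set_join(main_docs, "id_i", from_keys)]
print(joined_titles)
```

The set is the "memory structure" the description refers to: building it costs memory proportional to the number of from-side keys, and each main-query document is then checked in constant time on average.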

[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1375#comment-1375
 ] 

Mark Miller commented on LUCENE-2562:
-

FYI, I see a lot of the following type of thing in the console:

"borderColor" is not a valid style for org.apache.pivot.wtk.TextArea
"activeBackgroundColor" is not a valid style for org.apache.pivot.wtk.TextArea

Duplicate listener org.apache.lucene.luke.ui.LukeWindow$2@4a9a1ac added to 
org.apache.pivot.wtk.Component$ComponentMouseListenerList 
[org.apache.pivot.wtk.skin.terra.TerraPushButtonSkin@451dfada, 
org.apache.lucene.luke.ui.LukeWindow$2@4a9a1ac]

Perhaps that is currently expected, but FYI.

> Make Luke a Lucene/Solr Module
> --
>
> Key: LUCENE-2562
> URL: https://issues.apache.org/jira/browse/LUCENE-2562
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Mark Miller
>  Labels: gsoc2013
> Attachments: LUCENE-2562.patch, LUCENE-2562.patch, luke1.jpg, 
> luke2.jpg, luke3.jpg, Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, 
> Luke-ALE-4.png, Luke-ALE-5.png
>
>
> see
> "RE: Luke - in need of maintainer": 
> http://markmail.org/message/m4gsto7giltvrpuf
> "Web-based Luke": http://markmail.org/message/4xwps7p7ifltme5q
> I think it would be great if there was a version of Luke that always worked 
> with trunk - and it would also be great if it was easier to match Luke jars 
> with Lucene versions.
> While I'd like to get GWT Luke into the mix as well, I think the easiest 
> starting point is to straight port Luke to another UI toolkit before 
> abstracting out DTO objects that both GWT Luke and Pivot Luke could share.
> I've started slowly converting Luke's use of thinlet to Apache Pivot. I 
> haven't/don't have a lot of time for this at the moment, but I've plugged 
> away here and there over the past week or two. There is still a *lot* to do.

[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1373#comment-1373
 ] 

ASF subversion and git services commented on LUCENE-2562:
-

Commit 1523019 from [~markrmil...@gmail.com]
[ https://svn.apache.org/r1523019 ]

LUCENE-2562: Ajay Bhat

Support for 5 themes, through a recursive style change function
Exit option in File menu
Status bar
Analyzer tokenstream reset call
Documentation for above features
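
For readers wondering what a "recursive style change function" might look like, here is a hypothetical Python sketch. It is not the committed Luke code and does not use Apache Pivot's API; the Component class, its supported_styles set, and the theme keys are all stand-ins. The sketch also assumes that style keys a component does not support are simply skipped.

```python
from typing import Dict, Iterable, List, Optional


class Component:
    """Stand-in for a UI component with a set of style keys it accepts."""

    def __init__(
        self,
        name: str,
        supported_styles: Iterable[str],
        children: Optional[List["Component"]] = None,
    ) -> None:
        self.name = name
        self.supported_styles = set(supported_styles)
        self.styles: Dict[str, str] = {}
        self.children: List["Component"] = list(children or [])


def apply_theme(component: Component, theme: Dict[str, str]) -> int:
    """Apply theme to component and all its descendants.

    Returns the number of components visited.
    """
    for key, value in theme.items():
        # Assumption: unsupported style keys are skipped rather than reported.
        if key in component.supported_styles:
            component.styles[key] = value
    visited = 1
    for child in component.children:
        visited += apply_theme(child, theme)
    return visited


# Hypothetical component tree and theme.
text_area = Component("textArea", {"backgroundColor"})
button = Component("button", {"backgroundColor", "borderColor"})
window = Component("window", {"backgroundColor"}, [text_area, button])

gray_theme = {"backgroundColor": "#888888", "borderColor": "#444444"}
visited = apply_theme(window, gray_theme)
```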

> Make Luke a Lucene/Solr Module
> --
>
> Key: LUCENE-2562
> URL: https://issues.apache.org/jira/browse/LUCENE-2562
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Mark Miller
>  Labels: gsoc2013
> Attachments: LUCENE-2562.patch, LUCENE-2562.patch, luke1.jpg, 
> luke2.jpg, luke3.jpg, Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, 
> Luke-ALE-4.png, Luke-ALE-5.png
>
>
> see
> "RE: Luke - in need of maintainer": 
> http://markmail.org/message/m4gsto7giltvrpuf
> "Web-based Luke": http://markmail.org/message/4xwps7p7ifltme5q
> I think it would be great if there was a version of Luke that always worked 
> with trunk - and it would also be great if it was easier to match Luke jars 
> with Lucene versions.
> While I'd like to get GWT Luke into the mix as well, I think the easiest 
> starting point is to straight port Luke to another UI toolkit before 
> abstracting out DTO objects that both GWT Luke and Pivot Luke could share.
> I've started slowly converting Luke's use of thinlet to Apache Pivot. I 
> haven't/don't have a lot of time for this at the moment, but I've plugged 
> away here and there over the past week or two. There is still a *lot* to do.

[jira] [Commented] (SOLR-4787) Join Contrib

2013-09-13 Thread Kranti Parisa (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766657#comment-13766657
 ] 

Kranti Parisa commented on SOLR-4787:
-

Yes, will first test the bjoin for multi-valued fields and then try to extend 
hjoin for multi-value fields.

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *HashSetJoinQParserPlugin aka hjoin*
> The hjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the hjoin is designed to work with int and long join 
> keys only. So, in order to use hjoin, int or long join keys must be included 
> in both the to and from core.
> The second difference is that the hjoin builds memory structures that are 
> used to quickly connect the join keys. So, the hjoin will need more memory 
> than the JoinQParserPlugin to perform the join.
> The main advantage of the hjoin is that it can scale to join millions of keys 
> between cores and provide sub-second response time. The hjoin should work 
> well with up to two million results from the fromIndex and tens of millions 
> of results from the main query.
> The hjoin supports the following features:
> 1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will 
> turn on the PostFilter. The PostFilter will typically outperform the Lucene 
> query when the main query results have been narrowed down.
> 2) With the lucene query implementation there is an option to build the 
> filter with threads. This can greatly improve the performance of the query if 
> the main query index is very large. The "threads" parameter turns on 
> threading. For example *threads=6* will use 6 threads to build the filter. 
> This will set up a fixed threadpool with six threads to handle all hjoin 
> requests. Once the threadpool is created the hjoin will always use it to 
> build the filter. Threading does not come into play with the PostFilter.
> 3) The *size* local parameter can be used to set the initial size of the 
> hashset used to perform the join. If this is set above the number of results 
> from the fromIndex then you can avoid hashset resizing, which improves 
> performance.
> 4) Nested filter queries. The local parameter "fq" can be used to nest a 
> filter query within the join. The nested fq will filter the results of the 
> join query. This can point to another join to support nested joins.
> 5) Full caching support for the lucene query implementation. The filterCache 
> and queryResultCache should work properly even with deep nesting of joins. 
> Only the queryResultCache comes into play with the PostFilter implementation 
> because PostFilters are not cacheable in the filterCache.
> The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
> plugin is referenced by the string "hjoin" rather than "join".
> fq={!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
> fq=$qq}user:customer1&qq=group:5
> The example filter query above will search the fromIndex (collection2) for 
> "user:customer1" applying the local fq parameter to filter the results. The 
> lucene filter query will be built using 6 threads. This query will generate a 
> list of values from the "from" field that will be used to filter the main 
> query. Only records from the main query, where the "to" field is present in 
> the "from" list will be included in the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> hjoin.
>  class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
>  
>  
> *BitSetJoinQParserPlugin aka bjoin*
> The bjoin behaves exactly like the hjoin but uses a BitSet instead of a 
> HashSet to perform the underlying join. Because of this the bjoin is much 
> faster and can prov
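
For anyone testing the filter query from the description above, the request parameters can be assembled and URL-encoded as in the Python sketch below. The local parameter names (fromIndex, from, to, threads, fq) and the $qq dereference come from the description. The helper name build_hjoin_params and the main query q=*:* are assumptions for illustration, and the hjoin parser itself is only available with the patch applied.

```python
from typing import List, Optional, Tuple
from urllib.parse import parse_qs, urlencode


def build_hjoin_params(
    from_index: str,
    from_field: str,
    to_field: str,
    from_query: str,
    nested_fq: Optional[str] = None,
    threads: Optional[int] = None,
    main_query: str = "*:*",
) -> List[Tuple[str, str]]:
    """Build request parameters for an hjoin filter query (hypothetical helper)."""
    local = [f"fromIndex={from_index}", f"from={from_field}", f"to={to_field}"]
    if threads is not None:
        local.append(f"threads={threads}")
    if nested_fq is not None:
        # The nested filter is passed by reference through the qq parameter.
        local.append("fq=$qq")
    params = [
        ("q", main_query),
        ("fq", "{!hjoin " + " ".join(local) + "}" + from_query),
    ]
    if nested_fq is not None:
        params.append(("qq", nested_fq))
    return params


params = build_hjoin_params(
    "collection2", "id_i", "id_i", "user:customer1", nested_fq="group:5", threads=6
)
query_string = urlencode(params)   # safe to append after "?" in a request URL
decoded = parse_qs(query_string)   # round trip to check the encoding
```

Encoding matters here because the braces, the "!" and the "$" in the local params would otherwise be mangled when pasted into a URL by hand.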

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-13 Thread Jack Krupansky (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766448#comment-13766448
 ] 

Jack Krupansky commented on SOLR-1301:
--

Fix version still says 4.5.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 4.5, 5.0
>
> Attachments: commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SolrRecordWriter.java
>
>
> This patch contains a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
> task completes, and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.
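
The write path in the Design section above (convert each key/value pair, add it to a batch, submit the batch, then commit and optimize on close) can be sketched in Python as follows. This is not the SolrRecordWriter from the patch. RecordingServer is a stand-in that only records calls, and flushing when the batch reaches a fixed size is an assumption, since the description only says batches are submitted "periodically".

```python
from typing import Callable, Dict, List


class RecordingServer:
    """Stand-in for an embedded Solr server; it only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def add(self, docs: List[Dict[str, str]]) -> None:
        self.calls.append(f"add:{len(docs)}")

    def commit(self) -> None:
        self.calls.append("commit")

    def optimize(self) -> None:
        self.calls.append("optimize")


class BatchingRecordWriter:
    """Converts (key, value) pairs to documents and submits them in batches."""

    def __init__(
        self,
        server: RecordingServer,
        converter: Callable[[str, str], Dict[str, str]],
        batch_size: int,
    ) -> None:
        self._server = server
        self._converter = converter
        self._batch_size = batch_size
        self._batch: List[Dict[str, str]] = []

    def write(self, key: str, value: str) -> None:
        self._batch.append(self._converter(key, value))
        # Assumption: a batch is submitted once it reaches batch_size documents.
        if len(self._batch) >= self._batch_size:
            self._flush()

    def _flush(self) -> None:
        if self._batch:
            self._server.add(list(self._batch))
            self._batch.clear()

    def close(self) -> None:
        # Submit whatever is left, then commit and optimize as described.
        self._flush()
        self._server.commit()
        self._server.optimize()


server = RecordingServer()
writer = BatchingRecordWriter(server, lambda k, v: {"id": k, "text": v}, batch_size=2)
for key, value in [("1", "a"), ("2", "b"), ("3", "c")]:
    writer.write(key, value)
writer.close()
```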

Re: Joins on the confluence wiki

2013-09-13 Thread Cassandra Targett

Kranti, are you referring to the Solr Ref Guide (which is a Confluence
wiki)? I notice that there is a page already in the Solr wiki about
joins: http://wiki.apache.org/solr/Join, but not one in the Ref Guide
yet.

Policies for editing the Solr Ref Guide are different from the Solr
wiki, and are here:
https://cwiki.apache.org/confluence/display/solr/Internal+-+Maintaining+Documentation.

If you can't get access to create content for the Solr Ref Guide, you
can still make comments with suggestions for improvements. If you do
that, I'll be happy to add the new page and work with you to make it
right.

Cassandra

On Fri, Sep 13, 2013 at 8:06 AM, Erick Erickson  wrote:
> Just let us know your Wiki user ID and we'll add you
> to the approved list right away.
>
> Had some trouble with spam bots a while back so had to go
> this route.
>
> Thanks for volunteering to help!
>
> Erick
>
>
> On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa 
> wrote:
>>
>> Guys,
>>
>> Seems there is no wiki page for Joins. I have been using/working with Joins
>> and I want to start writing a page for the same on the Confluence wiki. How
>> can I get access for adding/editing the wiki pages?
>>
>> Thanks & Regards,
>> Kranti K Parisa
>> http://www.linkedin.com/in/krantiparisa
>>
>

[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766454#comment-13766454
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522907 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522907 ]

LUCENE-5207: Revert the dynamic class name. It's much better to use the "source
file attribute". The class name is now constant (as every class gets its own
class loader) and looks like an internal class of the compiler. The stack trace
then looks like:
Throwable #1: java.lang.IllegalArgumentException: foobar
   at __randomizedtesting.SeedInfo.seed([3968E8DD2901F71C:4292B9595A397818]:0)
   at org.apache.lucene.util.MathUtil.log(MathUtil.java:51)
   at org.apache.lucene.expressions.js.JavascriptCompiler$CompiledExpression.evaluate(logn(2, 0))
   at org.apache.lucene.expressions.js.TestJavascriptFunction.assertEvaluatesTo(TestJavascriptFunction.java:27)
   at org.apache.lucene.expressions.js.TestJavascriptFunction.testLognMethod(TestJavascriptFunction.java:178)
   at java.lang.Thread.run(Thread.java:724)

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField in 
> the end), e.g. sort by year descending, then by some function of score, price, 
> and time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortFields or other expressions.
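
The notes above mention bindings that map variable names to fields or to other expressions, and ranking by several keys at once. The Python sketch below illustrates both ideas under loud assumptions: it is not the Lucene expressions API, nothing is compiled to bytecode, the Bindings class and the expression "value_for_money" are invented for illustration, and documents are plain dicts.

```python
from typing import Callable, Dict, List, Set

Doc = Dict[str, object]


class Bindings:
    """Maps variable names to document fields or to other expressions."""

    def __init__(self) -> None:
        self._fields: Set[str] = set()
        self._expressions: Dict[str, Callable[["Bindings", Doc], float]] = {}

    def add_field(self, name: str) -> None:
        self._fields.add(name)

    def add_expression(self, name: str, expr: Callable[["Bindings", Doc], float]) -> None:
        self._expressions[name] = expr

    def value(self, name: str, doc: Doc) -> float:
        """Resolve a variable for one document; every value is a float."""
        if name in self._expressions:
            return float(self._expressions[name](self, doc))
        if name in self._fields:
            return float(doc[name])
        raise KeyError(f"unbound variable: {name}")


bindings = Bindings()
for field in ("score", "price", "year"):
    bindings.add_field(field)
# A "computed field": an expression defined in terms of other variables.
bindings.add_expression(
    "value_for_money", lambda b, d: b.value("score", d) / b.value("price", d)
)

docs: List[Doc] = [
    {"name": "A", "year": 2012, "score": 2.0, "price": 4.0},
    {"name": "B", "year": 2013, "score": 3.0, "price": 2.0},
    {"name": "C", "year": 2013, "score": 1.0, "price": 4.0},
]

# Sort by year descending, then by the expression ascending.
ranked = sorted(
    docs,
    key=lambda d: (-bindings.value("year", d), bindings.value("value_for_money", d)),
)
ranked_names = [d["name"] for d in ranked]
```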

[jira] [Commented] (SOLR-4787) Join Contrib

2013-09-13 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766446#comment-13766446
 ] 

Joel Bernstein commented on SOLR-4787:
--

Kranti, the bjoin now supports multi-value joins. I'll work on getting the 
patch up here today.

> Join Contrib
> 
>
> Key: SOLR-4787
> URL: https://issues.apache.org/jira/browse/SOLR-4787
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.2.1
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, 
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be 
> contributed to Solr. This contrib currently includes 3 join implementations. 
> The initial patch was generated from the Solr 4.3 tag. Because of changes in 
> the FieldCache API this patch will only build with Solr 4.2 or above.
> *HashSetJoinQParserPlugin aka hjoin*
> The hjoin provides a join implementation that filters results in one core 
> based on the results of a search in another core. This is similar in 
> functionality to the JoinQParserPlugin but the implementation differs in a 
> couple of important ways.
> The first way is that the hjoin is designed to work with int and long join 
> keys only. So, in order to use hjoin, int or long join keys must be included 
> in both the to and from core.
> The second difference is that the hjoin builds memory structures that are 
> used to quickly connect the join keys. So, the hjoin will need more memory 
> than the JoinQParserPlugin to perform the join.
> The main advantage of the hjoin is that it can scale to join millions of keys 
> between cores and provide sub-second response time. The hjoin should work 
> well with up to two million results from the fromIndex and tens of millions 
> of results from the main query.
> The hjoin supports the following features:
> 1) Both lucene query and PostFilter implementations. A *"cost"* > 99 will 
> turn on the PostFilter. The PostFilter will typically outperform the Lucene 
> query when the main query results have been narrowed down.
> 2) With the lucene query implementation there is an option to build the 
> filter with threads. This can greatly improve the performance of the query if 
> the main query index is very large. The "threads" parameter turns on 
> threading. For example *threads=6* will use 6 threads to build the filter. 
> This will set up a fixed threadpool with six threads to handle all hjoin 
> requests. Once the threadpool is created the hjoin will always use it to 
> build the filter. Threading does not come into play with the PostFilter.
> 3) The *size* local parameter can be used to set the initial size of the 
> hashset used to perform the join. If this is set above the number of results 
> from the fromIndex then you can avoid hashset resizing, which improves 
> performance.
> 4) Nested filter queries. The local parameter "fq" can be used to nest a 
> filter query within the join. The nested fq will filter the results of the 
> join query. This can point to another join to support nested joins.
> 5) Full caching support for the lucene query implementation. The filterCache 
> and queryResultCache should work properly even with deep nesting of joins. 
> Only the queryResultCache comes into play with the PostFilter implementation 
> because PostFilters are not cacheable in the filterCache.
> The syntax of the hjoin is similar to the JoinQParserPlugin except that the 
> plugin is referenced by the string "hjoin" rather than "join".
> fq={!hjoin fromIndex=collection2 from=id_i to=id_i threads=6 
> fq=$qq}user:customer1&qq=group:5
> The example filter query above will search the fromIndex (collection2) for 
> "user:customer1" applying the local fq parameter to filter the results. The 
> lucene filter query will be built using 6 threads. This query will generate a 
> list of values from the "from" field that will be used to filter the main 
> query. Only records from the main query, where the "to" field is present in 
> the "from" list will be included in the results.
> The solrconfig.xml in the main query core must contain the reference to the 
> hjoin.
>  class="org.apache.solr.joins.HashSetJoinQParserPlugin"/>
> And the join contrib jars must be registered in the solrconfig.xml.
>  
>  
> *BitSetJoinQParserPlugin aka bjoin*
> The bjoin behaves exactly like the hjoin but uses a BitSet instead of a 
> HashSet to perform the underlying join. Because of this the bjoin is much 
> faster and can provide sub-second response times
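
The description above says the bjoin swaps the HashSet for a BitSet. How the contrib maps join keys to bit positions is not stated in this message, so the Python sketch below simply assumes small non-negative integer keys used directly as bit indexes. The BitSet class, bitset_join, and the sample data are all hypothetical.

```python
from typing import Dict, Iterable, List


class BitSet:
    """Fixed-size bit set backed by a bytearray."""

    def __init__(self, size: int) -> None:
        self._size = size
        self._bits = bytearray((size + 7) // 8)

    def set(self, index: int) -> None:
        if not 0 <= index < self._size:
            raise IndexError(f"bit index out of range: {index}")
        self._bits[index >> 3] |= 1 << (index & 7)

    def get(self, index: int) -> bool:
        if not 0 <= index < self._size:
            return False
        return bool(self._bits[index >> 3] & (1 << (index & 7)))


def bitset_join(
    main_docs: Iterable[Dict[str, object]],
    to_field: str,
    from_keys: Iterable[int],
    key_space: int,
) -> List[Dict[str, object]]:
    """Keep main-query documents whose "to" key has its bit set."""
    bits = BitSet(key_space)
    for key in from_keys:
        bits.set(key)
    joined = []
    for doc in main_docs:
        key = doc.get(to_field)
        if isinstance(key, int) and bits.get(key):
            joined.append(doc)
    return joined


main_docs = [
    {"id_i": 1, "title": "a"},
    {"id_i": 2, "title": "b"},
    {"id_i": 3, "title": "c"},
]
bjoin_titles = [d["title"] for d in bitset_join(main_docs, "id_i", [1, 3], key_space=8)]
```

Under this assumption a membership test is a single index and mask operation with no hashing, at the price of allocating one bit for every possible key value.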

[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module

2013-09-13 Thread Ajay Bhat (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766649#comment-13766649
 ] 

Ajay Bhat commented on LUCENE-2562:
---

Thanks for the comments, [~markrmil...@gmail.com]. The only change for 
AnalyzersTab.java is the tokenstream reset call.

I used the default theme as per the one used in original Thinlet Luke. If you'd 
like, I'll do a slight modification so Gray is the default theme.

> Make Luke a Lucene/Solr Module
> --
>
> Key: LUCENE-2562
> URL: https://issues.apache.org/jira/browse/LUCENE-2562
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Mark Miller
>  Labels: gsoc2013
> Attachments: LUCENE-2562.patch, LUCENE-2562.patch, luke1.jpg, 
> luke2.jpg, luke3.jpg, Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, 
> Luke-ALE-4.png, Luke-ALE-5.png
>
>
> see
> "RE: Luke - in need of maintainer": 
> http://markmail.org/message/m4gsto7giltvrpuf
> "Web-based Luke": http://markmail.org/message/4xwps7p7ifltme5q
> I think it would be great if there was a version of Luke that always worked 
> with trunk - and it would also be great if it was easier to match Luke jars 
> with Lucene versions.
> While I'd like to get GWT Luke into the mix as well, I think the easiest 
> starting point is to straight port Luke to another UI toolkit before 
> abstracting out DTO objects that both GWT Luke and Pivot Luke could share.
> I've started slowly converting Luke's use of thinlet to Apache Pivot. I 
> haven't/don't have a lot of time for this at the moment, but I've plugged 
> away here and there over the past week or two. There is still a *lot* to do.

[jira] [Resolved] (SOLR-4221) Custom sharding

2013-09-13 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul resolved SOLR-4221.
--

Resolution: Fixed

> Custom sharding
> ---
>
> Key: SOLR-4221
> URL: https://issues.apache.org/jira/browse/SOLR-4221
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
>Assignee: Noble Paul
> Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, 
> SOLR-4221.patch, SOLR-4221.patch
>
>
> Features to let users control everything about sharding/routing.

[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766644#comment-13766644
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1523016 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1523016 ]

LUCENE-5207: Minor cleanups, also mark all generated methods as SYNTHETIC 
because there exists no source code

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. Sort by year descending, then some function of score, price and 
> time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortField's or other expressions.
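The "year descending, then some function of score and price ascending" ordering mentioned above can be illustrated in plain Java. This is only a sketch of the resulting sort order: the class `ExpressionSortSketch`, the `Doc` type and the `rank` helper are hypothetical names invented for illustration and are not part of the expressions module, and the lambda stands in for a compiled expression.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Plain-Java illustration only; none of these names come from the expressions module. */
final class ExpressionSortSketch {

    /** Hypothetical document carrying a few numeric ranking signals. */
    static final class Doc {
        final String id;
        final int year;
        final double score;
        final double price;

        Doc(String id, int year, double score, double price) {
            this.id = id;
            this.year = year;
            this.score = score;
            this.price = price;
        }
    }

    /** Orders by year descending, then by the given expression ascending; returns the ids. */
    static List<String> rank(List<Doc> docs, ToDoubleFunction<Doc> expression) {
        List<Doc> sorted = new ArrayList<>(docs);
        Comparator<Doc> byYearDesc = Comparator.comparingInt((Doc d) -> d.year).reversed();
        sorted.sort(byYearDesc.thenComparingDouble(expression));
        List<String> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.id);
        }
        return ids;
    }
}
```

Calling `ExpressionSortSketch.rank(docs, d -> d.price / d.score)` treats `price / score` as the secondary sort key; in the module itself that role would be played by an expression resolved through Bindings.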


[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module

2013-09-13 Thread Ajay Bhat (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766678#comment-13766678
 ] 

Ajay Bhat commented on LUCENE-2562:
---

Re: ["borderColor" is not a valid style for org.apache.pivot.wtk.TextArea
"activeBackgroundColor" is not a valid style for org.apache.pivot.wtk.TextArea]

Duly noted. I'll make sure to look into it and take care of it in the next 
patch.

> Make Luke a Lucene/Solr Module
> --
>
> Key: LUCENE-2562
> URL: https://issues.apache.org/jira/browse/LUCENE-2562
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Mark Miller
>  Labels: gsoc2013
> Attachments: LUCENE-2562.patch, LUCENE-2562.patch, luke1.jpg, 
> luke2.jpg, luke3.jpg, Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, 
> Luke-ALE-4.png, Luke-ALE-5.png
>
>
> see
> "RE: Luke - in need of maintainer": 
> http://markmail.org/message/m4gsto7giltvrpuf
> "Web-based Luke": http://markmail.org/message/4xwps7p7ifltme5q
> I think it would be great if there was a version of Luke that always worked 
> with trunk - and it would also be great if it was easier to match Luke jars 
> with Lucene versions.
> While I'd like to get GWT Luke into the mix as well, I think the easiest 
> starting point is to straight port Luke to another UI toolkit before 
> abstracting out DTO objects that both GWT Luke and Pivot Luke could share.
> I've started slowly converting Luke's use of thinlet to Apache Pivot. I 
> haven't/don't have a lot of time for this at the moment, but I've plugged 
> away here and there over the past week or two. There is still a *lot* to do.


[jira] [Commented] (LUCENE-3425) NRT Caching Dir to allow for exact memory usage, better buffer allocation and "global" cross indices control

2013-09-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766406#comment-13766406
 ] 

Michael McCandless commented on LUCENE-3425:


Hmm, I don't understand what AverageMergePolicy is doing?  Can you describe its 
purpose at a high level?  Somehow it's failing to merge those 40,000 small 
segments?

And offhand I don't know what changed between 4.3 and 4.4 that would cause 
AverageMergePolicy to stop merging small segments.

Maybe turn on IndexWriter's infoStream and watch which merges are being 
selected.

> NRT Caching Dir to allow for exact memory usage, better buffer allocation and 
> "global" cross indices control
> 
>
> Key: LUCENE-3425
> URL: https://issues.apache.org/jira/browse/LUCENE-3425
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 3.4, 4.0-ALPHA
>Reporter: Shay Banon
> Fix For: 5.0, 4.5
>
>
> A discussion on IRC raised several improvements that can be made to NRT 
> caching dir. Some of the problems it currently has are:
> 1. Not explicitly controlling the memory usage, which can result in overusing 
> memory (for example, large new segments being committed because refreshing is 
> too far behind).
> 2. Heap fragmentation because of constant allocation of (probably promoted to 
> old gen) byte buffers.
> 3. Not being able to control the memory usage across indices for multi index 
> usage within a single JVM.
> A suggested solution (which still needs to be ironed out) is to have a 
> BufferAllocator that controls allocation of byte[] and allows unused byte[] 
> to be returned to it. It will have a cap on the amount of memory it allows 
> to be allocated.
> The NRT caching dir will use the allocator, which can either be provided (for 
> usage across several indices) or created internally. The caching dir will 
> also create a wrapped IndexOutput, that will flush to the main dir if the 
> allocator can no longer provide byte[] (exhausted).
> When a file is "flushed" from the cache to the main directory, it will return 
> all the currently allocated byte[] to the BufferAllocator to be reused by 
> other "files".
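A minimal sketch of the allocator idea described above, assuming fixed-size buffers and a `null` return to signal exhaustion. The class name `CappedBufferAllocator` and its methods are hypothetical and do not correspond to any existing Lucene class; the issue itself says the design still needs to be ironed out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical capped allocator of fixed-size byte[] buffers; not Lucene API. */
final class CappedBufferAllocator {
    private final int bufferSize;
    private final long maxBytes;
    private long allocatedBytes;                            // bytes currently handed out
    private final Deque<byte[]> free = new ArrayDeque<>();  // returned buffers awaiting reuse

    CappedBufferAllocator(int bufferSize, long maxBytes) {
        this.bufferSize = bufferSize;
        this.maxBytes = maxBytes;
    }

    /** Returns a buffer, or null when the cap is reached (the caller would flush to the main dir). */
    synchronized byte[] allocate() {
        if (allocatedBytes + bufferSize > maxBytes) {
            return null;
        }
        allocatedBytes += bufferSize;
        byte[] reused = free.poll();
        // Reused buffers still hold stale bytes; callers must track their own write position.
        return reused != null ? reused : new byte[bufferSize];
    }

    /** Hands a buffer back so another "file" can reuse it. */
    synchronized void release(byte[] buffer) {
        if (buffer.length != bufferSize) {
            throw new IllegalArgumentException("buffer was not created by this allocator");
        }
        allocatedBytes -= bufferSize;
        free.push(buffer);
    }

    synchronized long allocatedBytes() {
        return allocatedBytes;
    }
}
```

One allocator instance could be shared by several caching directories to get the "global" cross-index cap; the sketch does not guard against releasing the same buffer twice.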


[jira] [Created] (LUCENE-5208) SnowballFilter to support minTokenLength

2013-09-13 Thread Markus Jelsma (JIRA)

Markus Jelsma created LUCENE-5208:
-

 Summary: SnowballFilter to support minTokenLength
 Key: LUCENE-5208
 URL: https://issues.apache.org/jira/browse/LUCENE-5208
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.4
Reporter: Markus Jelsma
 Fix For: 5.0
 Attachments: LUCENE-5208-trunk.patch

In some cases you don't want the stemmer to consider short tokens. Instead of 
modifying the snowball code, testing it, compiling it to Java code and the 
whole hassle, with this patch you can set a minTokenLength.
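The behaviour the patch describes can be sketched without any Lucene classes. The helper below is hypothetical (it is not the code in LUCENE-5208-trunk.patch), and it assumes that tokens strictly shorter than `minTokenLength` are passed through unstemmed; the actual patch may define the boundary differently.

```java
import java.util.function.UnaryOperator;

/** Hypothetical helper showing a minimum-length guard in front of a stemmer. */
final class MinLengthStemming {

    /** Applies the stemmer only to tokens of at least minTokenLength characters. */
    static String stemIfLongEnough(String token, int minTokenLength, UnaryOperator<String> stemmer) {
        if (token.length() < minTokenLength) {
            return token; // too short: leave the token untouched
        }
        return stemmer.apply(token);
    }
}
```

With a toy stemmer that strips a trailing "s", a `minTokenLength` of 3 leaves "as" alone while "cats" still becomes "cat".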


[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766622#comment-13766622
 ] 

Mark Miller commented on LUCENE-2562:
-

One comment - I think I like the gray theme for the default best.

Once I can apply a clean patch, I'll commit your current progress.


[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766616#comment-13766616
 ] 

Mark Miller commented on LUCENE-2562:
-

Nice, thanks Ajay!

Unfortunately, I am having trouble with the patch - it won't cleanly apply to 
AnalyzersTab.java. I left that file as is and checked out the color scheming 
though - looks good!


[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766512#comment-13766512
 ] 

Mark Miller commented on SOLR-1301:
---

Got it - the jtidy output was very generic (failed, returned 1), but I worked 
out that the problem was some '<' and '>' in the javadoc of SolrReducer and 
another class or two. After addressing that and adding a couple package.html 
files,  the precommit ant task now passes.

I have a variety of items still on the TODO list, but I think the critical path 
to an initial commit is:

* Move the Solr Morphline commands in.

* Get the tests to run without a hacked test.policy file - see my comment above 
about FileSystem#mkDirs.

* Look at the final jar we produce and how it works with the dependencies (eg 
it's currently going to the extraction contrib for tika, etc).

The other outstanding issues are not blocking an initial commit I don't think.

Also, FYI, since I did not mention, the previous patch will run the mini 
cluster tests based on the tests.disableHdfs sys prop now, so that is checked 
off.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 4.5, 5.0
>
> Attachments: commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SolrRecordWriter.java
>
>
> This patch contains  a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.
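The batch-then-submit flow in the Design section can be sketched roughly as follows. `BatchingWriter` is a hypothetical stand-in for SolrRecordWriter, and the `Consumer` sink stands in for submitting a batch to EmbeddedSolrServer; neither reflects the actual code in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical batching writer; illustrates the flow only, not SolrRecordWriter itself. */
final class BatchingWriter<D> {
    private final int batchSize;
    private final Consumer<List<D>> sink;            // stands in for the embedded server
    private final List<D> batch = new ArrayList<>();

    BatchingWriter(int batchSize, Consumer<List<D>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one converted document and submits the batch once it is full. */
    void write(D doc) {
        batch.add(doc);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    /** Submits any remainder; the real writer would then call commit() and optimize(). */
    void close() {
        flush();
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(batch)); // copy so the sink owns its own list
        batch.clear();
    }
}
```

Each reduce task would own one such writer, which is why the output ends up as one partial index per reducer.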


Re: [JENKINS] Lucene-Solr-SmokeRelease-4.x - Build # 107 - Still Failing

2013-09-13 Thread Robert Muir

Maybe it's not the OS but something depending on the default
locale/encoding (yes, I know, we specify it explicitly, but it's clearly
a bug in javadoc) causing it to be wrong only on the Linux box

On Fri, Sep 13, 2013 at 8:01 AM, Michael McCandless
 wrote:
> I just committed an attempted workaround.
>
> The problem is the {@link #getAttribute} turns into:
>
> ...You should always retrieve the wanted attributes using  href="../../../../org/apache/lucene/util/AttributeSource.html#getAttribute(java.lang.Class)">getAttribute(java.lang.Class)
> after adding...
>
> Ie, the javadocs gen failed to escape that .
>
> I've been unable to reproduce this; whenever I generate javadocs with
> Java 1.7.0_25 (on Linux, not FreeBSD) the  is properly escaped ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Sep 11, 2013 at 7:22 AM, Apache Jenkins Server
>  wrote:
>> Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/107/
>>
>> No tests ran.
>>
>> Build Log:
>> [...truncated 34293 lines...]
>> prepare-release-no-sign:
>> [mkdir] Created dir: 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease
>>  [copy] Copying 416 files to 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/lucene
>>  [copy] Copying 194 files to 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/solr
>>  [exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6
>>  [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
>>  [exec] NOTE: output encoding is US-ASCII
>>  [exec]
>>  [exec] Load release URL 
>> "file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/"...
>>  [exec]
>>  [exec] Test Lucene...
>>  [exec]   test basics...
>>  [exec]   get KEYS
>>  [exec] 0.1 MB in 0.02 sec (4.9 MB/sec)
>>  [exec]   check changes HTML...
>>  [exec]   download lucene-4.5.0-src.tgz...
>>  [exec] 27.1 MB in 0.04 sec (659.7 MB/sec)
>>  [exec] verify md5/sha1 digests
>>  [exec]   download lucene-4.5.0.tgz...
>>  [exec] 49.1 MB in 0.07 sec (661.3 MB/sec)
>>  [exec] verify md5/sha1 digests
>>  [exec]   download lucene-4.5.0.zip...
>>  [exec] 58.9 MB in 0.09 sec (646.8 MB/sec)
>>  [exec] verify md5/sha1 digests
>>  [exec]   unpack lucene-4.5.0.tgz...
>>  [exec] verify JAR/WAR metadata...
>>  [exec] test demo with 1.6...
>>  [exec]   got 5723 hits for query "lucene"
>>  [exec] test demo with 1.7...
>>  [exec]   got 5723 hits for query "lucene"
>>  [exec] check Lucene's javadoc JAR
>>  [exec]
>>  [exec] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeReleaseTmp/unpack/lucene-4.5.0/docs/core/org/apache/lucene/util/AttributeSource.html
>>  [exec]   broken details HTML: Method Detail: addAttributeImpl: closing 
>> "" does not match opening ""
>>  [exec]   broken details HTML: Method Detail: getAttribute: closing 
>> "" does not match opening ""
>>  [exec] Traceback (most recent call last):
>>  [exec]   File 
>> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>>  line 1450, in 
>>  [exec] main()
>>  [exec]   File 
>> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>>  line 1394, in main
>>  [exec] smokeTest(baseURL, svnRevision, version, tmpDir, isSigned, 
>> testArgs)
>>  [exec]   File 
>> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>>  line 1431, in smokeTest
>>  [exec] unpackAndVerify('lucene', tmpDir, artifact, svnRevision, 
>> version, testArgs)
>>  [exec]   File 
>> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>>  line 607, in unpackAndVerify
>>  [exec] verifyUnpacked(project, artifact, unpackPath, svnRevision, 
>> version, testArgs)
>>  [exec]   File 
>> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>>  line 786, in verifyUnpacked
>>  [exec] checkJavadocpath('%s/docs' % unpackPath)
>>  [exec]   File 
>> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>>  line 904, in checkJavadocpath
>>  [exec] raise RuntimeError('missing javadocs package summaries!')
>>  [exec] RuntimeError: missing javadocs package summaries!
>>
>> BUILD FAILED
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/build.xml:321:
>>  exec returned: 1
>>
>> Total time: 20 minutes 10 seconds
>> Build step 'Invoke Ant' marked build as failure
>> Email was triggered for: Failure
>> Sending email for trigg

Re: SimplePostToolTest very slow

2013-09-13 Thread Shai Erera

Yes it happens with the same seed. And it happened both times I ran it:
from Ant it takes ~280s, from eclipse ~1s.

Both times the test seems successful, and the outputs look the same, so I
guess it means it does the same thing when run from eclipse and ant.

Am I the only one that experiences this slowness?

Shai


On Fri, Sep 13, 2013 at 7:06 PM, Dawid Weiss
wrote:

> Is it with the same seed?
>
> Dawid
>
> On Fri, Sep 13, 2013 at 1:04 PM, Michael McCandless
>  wrote:
> > It passes when run from Eclipse?  It seems crazy that it can take 280
> > sec from ant but 1 sec from Eclipse.  Maybe it's not actually running,
> > when running from Eclipse?
> >
> > +1 for @Nightly if we can't get to the bottom of it ...
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> >
> > On Thu, Sep 12, 2013 at 5:03 PM, Shai Erera  wrote:
> >> Hi
> >>
> >> I was running Solr tests now and thought they hung, but eventually they
> >> continued and I noticed that SimplePostToolTest took 280s to complete. I
> >> tried from eclipse, and it took 1s. Tried from Ant again, 276s. I compared
> >> (briefly) the outputs of the test from eclipse and Ant, and they look
> >> similar.
> >>
> >> Is this expected? Maybe when the test runs from Ant it does more work (i.e.
> >> system properties that are sent from build.xml but not in eclipse cause it
> >> to index more data or something?). If it helps, here's what the test prints:
> >>
> >>[junit4] Suite: org.apache.solr.util.SimplePostToolTest
> >>[junit4]   2> log4j:WARN No such property [conversionPattern] in
> >> org.apache.solr.util.SolrLogLayout.
> >>[junit4]   2> 396 T24 oas.SolrTestCaseJ4.setUp ###Starting testIsOn
> >>[junit4]   2> 23126 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testIsOn
> >>[junit4] OK  22.8s | SimplePostToolTest.testIsOn
> >>[junit4]   2> 23163 T24 oas.SolrTestCaseJ4.setUp ###Starting
> >> testAppendUrlPath
> >>[junit4]   2> 54667 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> >> testAppendUrlPath
> >>[junit4] OK  31.5s | SimplePostToolTest.testAppendUrlPath
> >>[junit4]   2> 54682 T24 oas.SolrTestCaseJ4.setUp ###Starting
> >> testGuessType
> >>[junit4]   2> 77185 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> >> testGuessType
> >>[junit4] OK  22.5s | SimplePostToolTest.testGuessType
> >>[junit4]   2> 77198 T24 oas.SolrTestCaseJ4.setUp ###Starting
> >> testTypeSupported
> >>[junit4]   2> 99701 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> >> testTypeSupported
> >>[junit4] OK  22.5s | SimplePostToolTest.testTypeSupported
> >>[junit4]   2> 99712 T24 oas.SolrTestCaseJ4.setUp ###Starting
> >> testRobotsExclusion
> >>[junit4]   2> 122214 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> >> testRobotsExclusion
> >>[junit4] OK  22.5s | SimplePostToolTest.testRobotsExclusion
> >>[junit4]   2> 15 T24 oas.SolrTestCaseJ4.setUp ###Starting
> >> testParseArgsAndInit
> >>[junit4]   2> 144727 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> >> testParseArgsAndInit
> >>[junit4] OK  22.5s | SimplePostToolTest.testParseArgsAndInit
> >>[junit4]   2> 144736 T24 oas.SolrTestCaseJ4.setUp ###Starting
> >> testDoWebMode
> >>[junit4]   2> SimplePostTool: WARNING: The URL
> >> http://example.com/disallowed returned a HTTP result status of 403
> >>[junit4]   2> 185795 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> >> testDoWebMode
> >>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
> >>[junit4]   1> POSTed web resource http://example.com (depth: 0)
> >>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
> >>[junit4]   1> POSTed web resource http://example.com/page2 (depth:
> 1)
> >>[junit4]   1> POSTed web resource http://example.com/page1 (depth:
> 1)
> >>[junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
> >>[junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
> >>[junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
> >>[junit4]   1> POSTed web resource http://example.com/page1/foo/bar
> >> (depth: 3)
> >>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
> >>[junit4]   1> POSTed web resource http://example.com (depth: 0)
> >>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
> >>[junit4]   1> POSTed web resource http://example.com/page2 (depth:
> 1)
> >>[junit4]   1> POSTed web resource http://example.com/page1 (depth:
> 1)
> >>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
> >>[junit4]   1> POSTed web resource http://example.com (depth: 0)
> >>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
> >>[junit4]   1> POSTed web resource http://example.com/page2 (depth:
> 1)
> >>[junit4]   1> POSTed web resource http://example.com/page1 (depth:
> 1)
> >>[junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
> >>[junit4]   1> POSTed web resource http://example.com/dis

[jira] [Updated] (LUCENE-5123) invert the codec postings API

2013-09-13 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5123:
---

Attachment: LUCENE-5123.patch

New patch, resolving all nocommits.  Tests and ant precommit pass. I
think it's ready!

I moved all the MappingMulti* from oal.codecs to oal.index and made
them package private.

We can later tackle cutting over different postings formats to the
pull API...


> invert the codec postings API
> -
>
> Key: LUCENE-5123
> URL: https://issues.apache.org/jira/browse/LUCENE-5123
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Robert Muir
>Assignee: Michael McCandless
> Fix For: 5.0
>
> Attachments: LUCENE-5123.patch, LUCENE-5123.patch, LUCENE-5123.patch, 
> LUCENE-5123.patch, LUCENE-5123.patch
>
>
> Currently FieldsConsumer/PostingsConsumer/etc is a "push" oriented api, e.g. 
> FreqProxTermsWriter streams the postings at flush, and the default merge() 
> takes the incoming codec api and filters out deleted docs and "pushes" via 
> same api (but that can be overridden).
> It could be cleaner if we allowed for a "pull" model instead (like 
> DocValues). For example, maybe FreqProxTermsWriter could expose a Terms of 
> itself and just pass this to the codec consumer.
> This would give the codec more flexibility to e.g. do multiple passes if it 
> wanted to do things like encode high-frequency terms more efficiently with a 
> bitset-like encoding or other things...
> A codec can try to do things like this to some extent today, but it's very 
> difficult (look at buffering in Pulsing). We made this change with DV and it 
> made a lot of interesting optimizations easy to implement...
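To illustrate why a "pull" model gives the codec more freedom, here is a rough sketch in which the consumer is handed all postings and makes two passes over them. The class `TwoPassEncodingSketch`, the map-of-arrays representation and the mean-frequency threshold are assumptions made purely for illustration; they are not the API in the attached patch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustration of a consumer that pulls postings and makes two passes; not Lucene API. */
final class TwoPassEncodingSketch {

    /** Maps each term to an encoding name, using a statistic gathered in a first pass. */
    static Map<String, String> chooseEncodings(Map<String, int[]> postings) {
        // Pass 1: compute a global statistic, here the mean document frequency.
        long total = 0;
        for (int[] docs : postings.values()) {
            total += docs.length;
        }
        double meanDocFreq = postings.isEmpty() ? 0.0 : (double) total / postings.size();

        // Pass 2: revisit every term and choose an encoding using that statistic.
        Map<String, String> encodings = new TreeMap<>();
        for (Map.Entry<String, int[]> e : postings.entrySet()) {
            boolean highFrequency = e.getValue().length > meanDocFreq;
            encodings.put(e.getKey(), highFrequency ? "bitset" : "delta-list");
        }
        return encodings;
    }
}
```

With a push API the consumer sees each posting once as it streams past, so a decision that depends on a statistic over all terms would require buffering; here the second loop simply iterates again.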


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766545#comment-13766545
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522972 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522972 ]

LUCENE-5207: Refactor compiler to use final fields and simplify initialization


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766534#comment-13766534
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522967 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522967 ]

LUCENE-5207: Remove classloader field (is not needed, we call only once)

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (it's a SortField at 
> the end), e.g. sort by year descending, then some function of score, price, and 
> time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have solr integration to contribute in the future, but this is just the 
> standalone lucene part as a start. Since lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortFields or other expressions.
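
To make the Bindings idea in the description above concrete, here is a minimal, self-contained sketch in plain Java. It is not the Lucene expressions API: ToyBindings, DocValue, bindField and all variable names are hypothetical, and a plain map of doubles stands in for a document. It only illustrates how a variable name can resolve either to a raw field value or to another expression (a "computed field"), with everything evaluated as doubles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for "something that yields a double for a document".
interface DocValue {
    double eval(Map<String, Double> doc, ToyBindings bindings);
}

// Hypothetical bindings: maps variable names to field lookups or to other expressions.
final class ToyBindings {
    private final Map<String, DocValue> vars = new HashMap<>();

    // Bind a name to an arbitrary expression over the document and other bindings.
    ToyBindings bind(String name, DocValue value) {
        vars.put(name, value);
        return this;
    }

    // Bind a name directly to the field of the same name in the document.
    ToyBindings bindField(String name) {
        return bind(name, (doc, b) -> doc.get(name));
    }

    // Resolve a variable for one document; expressions may call back into value().
    double value(String name, Map<String, Double> doc) {
        DocValue v = vars.get(name);
        if (v == null) {
            throw new IllegalArgumentException("unbound variable: " + name);
        }
        return v.eval(doc, this);
    }
}
```

With "score" and "price" bound as fields, "boosted" bound to sqrt(score) * 2 and "rank" bound to boosted - price, evaluating "rank" for a document with score 9.0 and price 1.5 goes through two levels of lookup and yields 4.5.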

[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-09-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766494#comment-13766494
 ] 

Mark Miller commented on SOLR-1301:
---

I've worked out the javadoc warnings - that has led to some new issue(s) with 
jtidy. It's failing on SolrReducer.html.

> Add a Solr contrib that allows for building Solr indexes via Hadoop's 
> Map-Reduce.
> -
>
> Key: SOLR-1301
> URL: https://issues.apache.org/jira/browse/SOLR-1301
> Project: Solr
>  Issue Type: New Feature
>Reporter: Andrzej Bialecki 
>Assignee: Mark Miller
> Fix For: 4.5, 5.0
>
> Attachments: commons-logging-1.0.4.jar, 
> commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
> hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
> log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, 
> SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
> SolrRecordWriter.java
>
>
> This patch contains  a contrib module that provides distributed indexing 
> (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
> twofold:
> * provide an API that is familiar to Hadoop developers, i.e. that of 
> OutputFormat
> * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
> SolrOutputFormat consumes data produced by reduce tasks directly, without 
> storing it in intermediate files. Furthermore, by using an 
> EmbeddedSolrServer, the indexing task is split into as many parts as there 
> are reducers, and the data to be indexed is not sent over the network.
> Design
> --
> Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
> which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
> instantiates an EmbeddedSolrServer, and it also instantiates an 
> implementation of SolrDocumentConverter, which is responsible for turning 
> Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
> batch, which is periodically submitted to EmbeddedSolrServer. When the reduce 
> task completes and the OutputFormat is closed, SolrRecordWriter calls 
> commit() and optimize() on the EmbeddedSolrServer.
> The API provides facilities to specify an arbitrary existing solr.home 
> directory, from which the conf/ and lib/ files will be taken.
> This process results in the creation of as many partial Solr home directories 
> as there were reduce tasks. The output shards are placed in the output 
> directory on the default filesystem (e.g. HDFS). Such part-N directories 
> can be used to run N shard servers. Additionally, users can specify the 
> number of reduce tasks, in particular 1 reduce task, in which case the output 
> will consist of a single shard.
> An example application is provided that processes large CSV files and uses 
> this API. It uses custom CSV processing to avoid (de)serialization overhead.
> This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this 
> issue, you should put it in contrib/hadoop/lib.
> Note: the development of this patch was sponsored by an anonymous contributor 
> and approved for release under Apache License.
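
The write path described in the Design section above (convert each key/value pair, add it to a batch, submit the batch periodically, commit when the writer is closed) can be sketched in plain Java. This is an illustration under assumptions, not the code from the patch: BatchingWriter and its constructor parameters are hypothetical names, the converter plays the role of SolrDocumentConverter, the sink stands in for submitting a batch to EmbeddedSolrServer, and onClose stands in for commit() and optimize().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical sketch of a record writer that batches converted documents.
final class BatchingWriter<K, V, D> implements AutoCloseable {
    private final BiFunction<K, V, D> converter; // (key, value) -> document
    private final Consumer<List<D>> sink;        // receives each full batch
    private final int batchSize;
    private final Runnable onClose;              // e.g. commit/optimize
    private final List<D> batch = new ArrayList<>();

    BatchingWriter(BiFunction<K, V, D> converter, Consumer<List<D>> sink,
                   int batchSize, Runnable onClose) {
        this.converter = converter;
        this.sink = sink;
        this.batchSize = batchSize;
        this.onClose = onClose;
    }

    // Convert one reduce output record and submit the batch once it is full.
    void write(K key, V value) {
        batch.add(converter.apply(key, value));
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(batch)); // hand over a copy, then reuse the buffer
        batch.clear();
    }

    // Submit whatever is left, then run the final commit step exactly once.
    @Override
    public void close() {
        flush();
        onClose.run();
    }
}
```

Because nothing leaves the writer except through the sink, the sketch also reflects the point made in the description that the data to be indexed stays local to the reduce task.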

[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #446: POMs out of sync

2013-09-13 Thread Apache Jenkins Server

Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/446/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
expected:<5> but was:<4>

Stack Trace:
java.lang.AssertionError: expected:<5> but was:<4>
at 
__randomizedtesting.SeedInfo.seed([F433A3E68CB75F37:75D52DFEFBE83F0B]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:175)




Build Log:
[...truncated 24975 lines...]



[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766482#comment-13766482
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522925 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522925 ]

LUCENE-5207: Add a unused test method to make sure that if we change the 
FunctionValues interface we get compile error. Also make the class format 
version a constant for easy maintenance (once we backport)

[jira] [Commented] (LUCENE-3425) NRT Caching Dir to allow for exact memory usage, better buffer allocation and "global" cross indices control

2013-09-13 Thread caviler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766430#comment-13766430
 ] 

caviler commented on LUCENE-3425:
-

Because IndexSearcher.search uses a thread pool to run the search on every 
segment, a segment that is too big makes the search of that segment slow, and 
in the end the whole search call is slow.

For example, take 2G of index files.

=
With AverageMergePolicy, IndexSearcher.search takes 1s:

segment   size     segment search time
  1       500M     1s
  2       500M     1s
  3       500M     1s
  4       500M     1s

=
With another MergePolicy, IndexSearcher.search takes 5s:

segment   size     segment search time
  1       2000M    5s

=

Why use AverageMergePolicy rather than LogByteSizeMergePolicy?

Because:
1. I want every segment to be as small as possible.
2. I want as many segments as possible.

We don't know how big a single segment will get as the whole index grows, so 
we can't use LogByteSizeMergePolicy.

With LogByteSizeMergePolicy and setMaxMergeMB(200M):

index =  200M: LogByteSizeMergePolicy =  1 segment (200M each), 
AverageMergePolicy = 4 segments (50M each)
index = 2000M: LogByteSizeMergePolicy = 10 segments (200M each), 
AverageMergePolicy = 4 segments (500M each)
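
The arithmetic in the comment above can be written down as a small sketch. It takes the comment's model at face value: segments are searched in parallel, so the overall latency is that of the slowest segment, and a size-capped policy is idealized as filling every segment up to the cap. SegmentMath and its method names are hypothetical and are not part of Lucene.

```java
// Hypothetical helper reproducing the numbers quoted in the comment.
final class SegmentMath {
    // Idealized segment count when each segment is capped at maxSegmentMb.
    static long cappedSegmentCount(long indexMb, long maxSegmentMb) {
        return (indexMb + maxSegmentMb - 1) / maxSegmentMb; // ceiling division
    }

    // Segment size when the index is split evenly into a fixed number of segments.
    static long evenSegmentSizeMb(long indexMb, int segmentCount) {
        return indexMb / segmentCount;
    }

    // With one thread per segment, the search takes as long as the slowest segment.
    static double parallelSearchSeconds(double... perSegmentSeconds) {
        double slowest = 0.0;
        for (double s : perSegmentSeconds) {
            slowest = Math.max(slowest, s);
        }
        return slowest;
    }
}
```

For the 2000M index this gives 10 capped segments of 200M versus 4 even segments of 500M, and for the timing example it gives 1s for four 1s segments versus 5s for a single 5s segment, matching the figures in the comment.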








> NRT Caching Dir to allow for exact memory usage, better buffer allocation and 
> "global" cross indices control
> 
>
> Key: LUCENE-3425
> URL: https://issues.apache.org/jira/browse/LUCENE-3425
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 3.4, 4.0-ALPHA
>Reporter: Shay Banon
> Fix For: 5.0, 4.5
>
>
> A discussion on IRC raised several improvements that can be made to NRT 
> caching dir. Some of the problems it currently has are:
> 1. Not explicitly controlling the memory usage, which can result in overusing 
> memory (for example, large new segments being committed because refreshing is 
> too far behind).
> 2. Heap fragmentation because of constant allocation of (probably promoted to 
> old gen) byte buffers.
> 3. Not being able to control the memory usage across indices for multi index 
> usage within a single JVM.
> A suggested solution (which still needs to be ironed out) is to have a 
> BufferAllocator that controls allocation of byte[], and allow to return 
> unused byte[] to it. It will have a cap on the size of memory it allows to be 
> allocated.
> The NRT caching dir will use the allocator, which can either be provided (for 
> usage across several indices) or created internally. The caching dir will 
> also create a wrapped IndexOutput, that will flush to the main dir if the 
> allocator can no longer provide byte[] (exhausted).
> When a file is "flushed" from the cache to the main directory, it will return 
> all the currently allocated byte[] to the BufferAllocator to be reused by 
> other "files".
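
The BufferAllocator proposed in the description above could look roughly like the following. This is only a sketch of the idea as stated in the issue (a cap on total memory, buffers returned for reuse, and a signal when the allocator is exhausted); the class name CappedBufferAllocator, the fixed buffer size and the choice of returning null on exhaustion are assumptions, not the design that was eventually discussed or committed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical allocator that hands out fixed-size byte[] buffers under a global cap.
final class CappedBufferAllocator {
    private final int bufferSize;
    private final long maxBytes;
    private final Deque<byte[]> free = new ArrayDeque<>();
    private long reservedBytes; // bytes currently handed out or sitting in the pool

    CappedBufferAllocator(int bufferSize, long maxBytes) {
        this.bufferSize = bufferSize;
        this.maxBytes = maxBytes;
    }

    // Returns a buffer, or null when the cap is reached; a caching directory
    // would then flush the file to the main directory instead of buffering it.
    synchronized byte[] allocate() {
        byte[] pooled = free.poll();
        if (pooled != null) {
            return pooled; // reuse avoids allocating fresh arrays over and over
        }
        if (reservedBytes + bufferSize > maxBytes) {
            return null;
        }
        reservedBytes += bufferSize;
        return new byte[bufferSize];
    }

    // Called when a cached file has been flushed and its buffers are no longer needed.
    synchronized void release(byte[] buffer) {
        if (buffer.length != bufferSize) {
            throw new IllegalArgumentException("buffer was not created by this allocator");
        }
        free.push(buffer);
    }
}
```

Sharing one instance across several caching directories would give the cross-index cap mentioned in point 3, since all of them would draw from the same budget.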

Re: SimplePostToolTest very slow

2013-09-13 Thread Michael McCandless

It passes when run from Eclipse?  It seems crazy that it can take 280
sec from ant but 1 sec from Eclipse.  Maybe it's not actually running,
when running from Eclipse?

+1 for @Nightly if we can't get to the bottom of it ...

Mike McCandless

http://blog.mikemccandless.com


On Thu, Sep 12, 2013 at 5:03 PM, Shai Erera  wrote:
> Hi
>
> I was running Solr tests now and thought they hung, but eventually they
> continued and I noticed that SimplePostToolTest took 280s to complete. I
> tried from eclipse, and it took 1s. Tried from Ant again, 276s. I compared
> (briefly) the outputs of the test from eclipse and Ant, and they look
> similar.
>
> Is this expected? Maybe when the test runs from Ant it does more work (i.e.
> system properties that are sent from build.xml but not in eclipse cause it
> to index more data or something?). If it helps, here's what the test prints:
>
>[junit4] Suite: org.apache.solr.util.SimplePostToolTest
>[junit4]   2> log4j:WARN No such property [conversionPattern] in
> org.apache.solr.util.SolrLogLayout.
>[junit4]   2> 396 T24 oas.SolrTestCaseJ4.setUp ###Starting testIsOn
>[junit4]   2> 23126 T24 oas.SolrTestCaseJ4.tearDown ###Ending testIsOn
>[junit4] OK  22.8s | SimplePostToolTest.testIsOn
>[junit4]   2> 23163 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testAppendUrlPath
>[junit4]   2> 54667 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testAppendUrlPath
>[junit4] OK  31.5s | SimplePostToolTest.testAppendUrlPath
>[junit4]   2> 54682 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testGuessType
>[junit4]   2> 77185 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testGuessType
>[junit4] OK  22.5s | SimplePostToolTest.testGuessType
>[junit4]   2> 77198 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testTypeSupported
>[junit4]   2> 99701 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testTypeSupported
>[junit4] OK  22.5s | SimplePostToolTest.testTypeSupported
>[junit4]   2> 99712 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testRobotsExclusion
>[junit4]   2> 122214 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testRobotsExclusion
>[junit4] OK  22.5s | SimplePostToolTest.testRobotsExclusion
>[junit4]   2> 15 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testParseArgsAndInit
>[junit4]   2> 144727 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testParseArgsAndInit
>[junit4] OK  22.5s | SimplePostToolTest.testParseArgsAndInit
>[junit4]   2> 144736 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testDoWebMode
>[junit4]   2> SimplePostTool: WARNING: The URL
> http://example.com/disallowed returned a HTTP result status of 403
>[junit4]   2> 185795 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testDoWebMode
>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
>[junit4]   1> POSTed web resource http://example.com (depth: 0)
>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
>[junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
>[junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
>[junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
>[junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
>[junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
>[junit4]   1> POSTed web resource http://example.com/page1/foo/bar
> (depth: 3)
>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
>[junit4]   1> POSTed web resource http://example.com (depth: 0)
>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
>[junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
>[junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
>[junit4]   1> Entering crawl at level 0 (1 links total, 1 new)
>[junit4]   1> POSTed web resource http://example.com (depth: 0)
>[junit4]   1> Entering crawl at level 1 (2 links total, 2 new)
>[junit4]   1> POSTed web resource http://example.com/page2 (depth: 1)
>[junit4]   1> POSTed web resource http://example.com/page1 (depth: 1)
>[junit4]   1> Entering crawl at level 2 (2 links total, 2 new)
>[junit4]   1> POSTed web resource http://example.com/disallowed (depth:
> 2)
>[junit4]   1> POSTed web resource http://example.com/page1/foo (depth: 2)
>[junit4]   1> Entering crawl at level 3 (1 links total, 1 new)
>[junit4]   1> POSTed web resource http://example.com/page1/foo/bar
> (depth: 3)
>[junit4] OK  41.1s | SimplePostToolTest.testDoWebMode
>[junit4]   2> 185806 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testAppendParam
>[junit4]   2> 208310 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testAppendParam
>[junit4] OK  22.5s | SimplePostToolTest.testAppendParam
>[junit4]   2> 208320 T24 oas.SolrTestCaseJ4.setUp ###Starting
> testComputeFullUrl
>[junit4]   2> 230822 T24 oas.SolrTestCaseJ4.tearDown ###Ending
> testComputeFullUrl
>[junit4] OK  22

[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766392#comment-13766392
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522858 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522858 ]

LUCENE-5207: Remove classloader constructor, because it makes no sense to use 
any other classloader

[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766428#comment-13766428
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522888 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522888 ]

LUCENE-5207: Remove stupidity... :(

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_40) - Build # 7427 - Still Failing!

2013-09-13 Thread Michael McCandless

This doesn't repro on current trunk ...

Mike McCandless

http://blog.mikemccandless.com


On Thu, Sep 12, 2013 at 5:26 AM, Policeman Jenkins Server
 wrote:
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7427/
> Java: 64bit/jdk1.7.0_40 -XX:-UseCompressedOops -XX:+UseG1GC
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.classification.KNearestNeighborClassifierTest.testPerformance
>
> Error Message:
> CheckIndex failed
>
> Stack Trace:
> java.lang.RuntimeException: CheckIndex failed
> at 
> __randomizedtesting.SeedInfo.seed([68F81CB00E47182B:AF19EE9265F32084]:0)
> at org.apache.lucene.util._TestUtil.checkIndex(_TestUtil.java:227)
> at 
> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:659)
> at 
> org.apache.lucene.classification.ClassificationTestBase.tearDown(ClassificationTestBase.java:70)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:795)
> at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> at 
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at java.lang.Thread.run(Thread.java:724)
>
>
>
>
> Build Log:
> [...truncated 6618 lines...]
>[junit4] Suite: 
> org.apache.lucene.classification.KNearestNeighborClassifierTest
>[junit4]   1> CheckIndex failed
>[junit4]   1> Segments file=segments_3 numSegments=2 version=5.0 format=
>[junit4]   1>   1 of 2: name=_0 docCount=898
>[ju

Re: [JENKINS] Lucene-Solr-SmokeRelease-4.x - Build # 107 - Still Failing

2013-09-13 Thread Michael McCandless

I just committed an attempted workaround.

The problem is the {@link #getAttribute} turns into:

...You should always retrieve the wanted attributes using getAttribute(java.lang.Class)
after adding...

Ie, the javadocs gen failed to escape that .

I've been unable to reproduce this; whenever I generate javadocs with
Java 1.7.0_25 (on Linux, not FreeBSD) the  is properly escaped ...
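
The smoke tester error in the quoted log ("closing ... does not match opening ...") is the kind of message a tag-balance check produces when text containing an unescaped angle bracket is read as markup. The sketch below is not the logic of smokeTestRelease.py; TagBalance is a hypothetical class, and the generic parameter used in the usage note is an assumption made purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified checker: every closing tag must match the last opened tag.
final class TagBalance {
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Returns null when the tags are balanced, else a description of the first problem.
    static String firstProblem(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (m.group(1).isEmpty()) {
                open.push(name); // opening tag
            } else if (open.isEmpty()) {
                return "closing \"" + name + "\" has no opening tag";
            } else {
                String top = open.pop();
                if (!top.equals(name)) {
                    return "closing \"" + name + "\" does not match opening \"" + top + "\"";
                }
            }
        }
        return open.isEmpty() ? null : "unclosed \"" + open.peek() + "\"";
    }
}
```

Given a code element whose text contains an unescaped type parameter such as `<T>`, the checker treats T as an opening tag and reports that the closing code tag does not match it; with the brackets written as `&lt;T&gt;` the same input is balanced.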

Mike McCandless

http://blog.mikemccandless.com


On Wed, Sep 11, 2013 at 7:22 AM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/107/
>
> No tests ran.
>
> Build Log:
> [...truncated 34293 lines...]
> prepare-release-no-sign:
> [mkdir] Created dir: 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease
>  [copy] Copying 416 files to 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/lucene
>  [copy] Copying 194 files to 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/solr
>  [exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6
>  [exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
>  [exec] NOTE: output encoding is US-ASCII
>  [exec]
>  [exec] Load release URL 
> "file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/"...
>  [exec]
>  [exec] Test Lucene...
>  [exec]   test basics...
>  [exec]   get KEYS
>  [exec] 0.1 MB in 0.02 sec (4.9 MB/sec)
>  [exec]   check changes HTML...
>  [exec]   download lucene-4.5.0-src.tgz...
>  [exec] 27.1 MB in 0.04 sec (659.7 MB/sec)
>  [exec] verify md5/sha1 digests
>  [exec]   download lucene-4.5.0.tgz...
>  [exec] 49.1 MB in 0.07 sec (661.3 MB/sec)
>  [exec] verify md5/sha1 digests
>  [exec]   download lucene-4.5.0.zip...
>  [exec] 58.9 MB in 0.09 sec (646.8 MB/sec)
>  [exec] verify md5/sha1 digests
>  [exec]   unpack lucene-4.5.0.tgz...
>  [exec] verify JAR/WAR metadata...
>  [exec] test demo with 1.6...
>  [exec]   got 5723 hits for query "lucene"
>  [exec] test demo with 1.7...
>  [exec]   got 5723 hits for query "lucene"
>  [exec] check Lucene's javadoc JAR
>  [exec]
>  [exec] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeReleaseTmp/unpack/lucene-4.5.0/docs/core/org/apache/lucene/util/AttributeSource.html
>  [exec]   broken details HTML: Method Detail: addAttributeImpl: closing 
> "" does not match opening ""
>  [exec]   broken details HTML: Method Detail: getAttribute: closing 
> "" does not match opening ""
>  [exec] Traceback (most recent call last):
>  [exec]   File 
> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>  line 1450, in 
>  [exec] main()
>  [exec]   File 
> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>  line 1394, in main
>  [exec] smokeTest(baseURL, svnRevision, version, tmpDir, isSigned, 
> testArgs)
>  [exec]   File 
> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>  line 1431, in smokeTest
>  [exec] unpackAndVerify('lucene', tmpDir, artifact, svnRevision, 
> version, testArgs)
>  [exec]   File 
> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>  line 607, in unpackAndVerify
>  [exec] verifyUnpacked(project, artifact, unpackPath, svnRevision, 
> version, testArgs)
>  [exec]   File 
> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>  line 786, in verifyUnpacked
>  [exec] checkJavadocpath('%s/docs' % unpackPath)
>  [exec]   File 
> "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py",
>  line 904, in checkJavadocpath
>  [exec] raise RuntimeError('missing javadocs package summaries!')
>  [exec] RuntimeError: missing javadocs package summaries!
>
> BUILD FAILED
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/build.xml:321:
>  exec returned: 1
>
> Total time: 20 minutes 10 seconds
> Build step 'Invoke Ant' marked build as failure
> Email was triggered for: Failure
> Sending email for trigger: Failure

[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766411#comment-13766411
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522873 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522873 ]

LUCENE-5207: Create the class name of generated classes from the parsed text

> lucene expressions module
> -
>
> Key: LUCENE-5207
> URL: https://issues.apache.org/jira/browse/LUCENE-5207
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Ryan Ernst
> Attachments: LUCENE-5207.patch
>
>
> Expressions are geared at defining an alternative ranking function (e.g. 
> incorporating the text relevance score and other field values/ranking
> signals). So they are conceptually much more like ElasticSearch's scripting 
> support (http://www.elasticsearch.org/guide/reference/modules/scripting/) 
> than solr's function queries.
> Some additional notes:
> * In addition to referring to other fields, they can also refer to other 
> expressions, so they can be used as "computed fields".
> * You can rank documents easily by multiple expressions (its a SortField at 
> the end), e.g. Sort by year descending, then some function of score price and 
> time ascending.
> * The provided javascript expression syntax is much more efficient than using 
> a scripting engine, because it does not have dynamic typing (compiles to 
> .class files that work on doubles). Performance is similar to writing a 
> custom FieldComparator yourself, but much easier to do.
> * We have Solr integration to contribute in the future, but this is just the 
> standalone Lucene part as a start. Since Lucene has no schema, it includes an 
> implementation of Bindings (SimpleBindings) that maps variable names to 
> SortFields or other expressions.
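
The "computed fields" idea in the notes above (variable names bound either to a field or to another expression, all evaluated on doubles) can be illustrated with a minimal standalone sketch. This is not the Lucene expressions API: the class BindingsSketch, its methods, and the map-based document model are all hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical illustration of name-to-value bindings; NOT Lucene's Bindings/SimpleBindings.
final class BindingsSketch {
    // A document is modelled as a map from field name to a double value.
    // An expression is any function from a document to a double.
    private final Map<String, ToDoubleFunction<Map<String, Double>>> vars = new HashMap<>();

    // Bind a variable name directly to a document field (assumed to default to 0.0 if absent).
    void addField(String name) {
        vars.put(name, doc -> doc.getOrDefault(name, 0.0));
    }

    // Bind a variable name to another expression, i.e. a "computed field".
    void addExpression(String name, ToDoubleFunction<Map<String, Double>> expr) {
        vars.put(name, expr);
    }

    // Resolve a variable for one document; fails loudly on unbound names.
    double value(String name, Map<String, Double> doc) {
        ToDoubleFunction<Map<String, Double>> f = vars.get(name);
        if (f == null) {
            throw new IllegalArgumentException("unbound variable: " + name);
        }
        return f.applyAsDouble(doc);
    }

    public static void main(String[] args) {
        BindingsSketch b = new BindingsSketch();
        b.addField("score");
        b.addField("price");
        // "value" refers to two fields; "boosted" refers to another expression.
        b.addExpression("value", doc -> b.value("score", doc) / (1 + b.value("price", doc)));
        b.addExpression("boosted", doc -> 2 * b.value("value", doc));

        Map<String, Double> doc = new HashMap<>();
        doc.put("score", 3.0);
        doc.put("price", 2.0);
        System.out.println(b.value("boosted", doc));
    }
}
```

Because every variable resolves to a plain double, an expression can be reused as a sort key or as an input to another expression without any dynamic typing, which is the property the issue description attributes to the compiled expressions.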


[jira] [Updated] (LUCENE-5208) SnowballFilter to support minTokenLength

2013-09-13 Thread Markus Jelsma (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated LUCENE-5208:
--

Attachment: LUCENE-5208-trunk.patch

Patch for trunk.

> SnowballFilter to support minTokenLength
> 
>
> Key: LUCENE-5208
> URL: https://issues.apache.org/jira/browse/LUCENE-5208
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Affects Versions: 4.4
>Reporter: Markus Jelsma
> Fix For: 5.0
>
> Attachments: LUCENE-5208-trunk.patch
>
>
> In some cases you don't want the stemmer to consider short tokens. Instead of 
> modifying the snowball code, testing it, compiling it to Java code and the 
> whole hassle, with this patch you can set a minTokenLength.
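
The behaviour described above can be sketched as a simple length guard in front of the stemmer. This is not the code from the attached patch: the class MinLengthStemSketch, the method name, the toy stemmer, and the choice of a strict "shorter than" comparison are assumptions made for illustration.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch of a minimum-token-length guard; NOT the SnowballFilter patch itself.
final class MinLengthStemSketch {
    // Tokens shorter than minTokenLength are returned unchanged (assumed boundary rule);
    // all other tokens are passed to the supplied stemmer.
    static String stemIfLongEnough(String token, int minTokenLength, UnaryOperator<String> stemmer) {
        if (token.length() < minTokenLength) {
            return token;
        }
        return stemmer.apply(token);
    }

    public static void main(String[] args) {
        // Toy stemmer that only strips one trailing "s"; a stand-in for a real Snowball stemmer.
        UnaryOperator<String> toyStemmer =
            s -> s.endsWith("s") ? s.substring(0, s.length() - 1) : s;

        System.out.println(stemIfLongEnough("as", 3, toyStemmer));   // short token, left alone
        System.out.println(stemIfLongEnough("cats", 3, toyStemmer)); // long enough, stemmed
    }
}
```

Keeping the check outside the stemmer means the generated Snowball code does not have to be modified or regenerated, which is the motivation given in the issue.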


[jira] [Commented] (LUCENE-5207) lucene expressions module

2013-09-13 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766416#comment-13766416
 ] 

ASF subversion and git services commented on LUCENE-5207:
-

Commit 1522877 from [~thetaphi] in branch 'dev/branches/lucene5207'
[ https://svn.apache.org/r1522877 ]

LUCENE-5207: Limit the maximum class name length



Re: Joins on the confluence wiki

2013-09-13 Thread Erick Erickson

Just let us know your Wiki user ID and we'll add you
to the approved list right away.

Had some trouble with spam bots a while back so had to go
this route.

Thanks for volunteering to help!

Erick

On Thu, Sep 12, 2013 at 9:16 PM, Kranti Parisa wrote:

> Guys,
>
> It seems there is no wiki page for Joins. I have been using and working on
> Joins, and I want to start writing a page about them on the Confluence wiki.
> How can I get access to add and edit wiki pages?
>
> Thanks & Regards,
> Kranti K Parisa
> http://www.linkedin.com/in/krantiparisa
>
>

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_40) - Build # 7427 - Still Failing!

2013-09-13 Thread Robert Muir

Maybe it will reproduce with the master seed.


On Fri, Sep 13, 2013 at 7:14 AM, Michael McCandless wrote:
> This doesn't repro on current trunk ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Sep 12, 2013 at 5:26 AM, Policeman Jenkins Server wrote:
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7427/
>> Java: 64bit/jdk1.7.0_40 -XX:-UseCompressedOops -XX:+UseG1GC
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.classification.KNearestNeighborClassifierTest.testPerformance
>>
>> Error Message:
>> CheckIndex failed
>>
>> Stack Trace:
>> java.lang.RuntimeException: CheckIndex failed
>> at 
>> __randomizedtesting.SeedInfo.seed([68F81CB00E47182B:AF19EE9265F32084]:0)
>> at org.apache.lucene.util._TestUtil.checkIndex(_TestUtil.java:227)
>> at 
>> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:659)
>> at 
>> org.apache.lucene.classification.ClassificationTestBase.tearDown(ClassificationTestBase.java:70)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:795)
>> at 
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> at 
>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>> at java.lang.Thread.run(Thread.java:724)
>>
>>
>>
>>
>> Build Log:
>> [...truncated 6618 lines...]
>>[junit4] Suite: 
>> org.apache.lucene.cla
