[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-03 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673096#comment-13673096
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Nope, only that. It's fun to see how everything else goes bust when you run 
those tests on that dead collections branch though.
I'll run those microbenchmarks when I get a spare minute.

> Sets and maps incorrectly clear() their state arrays (potential endless loops)
> --
>
>                 Key: MAHOUT-1225
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1225
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.7
>         Environment: Eclipse, linux Fedora 17, Java 1.7, Mahout Maths 
> collections (Set) 0.7, hppc 0.4.3
>            Reporter: Sophie Sperner
>            Assignee: Dawid Weiss
>              Labels: hashset, java, mahout, test
>             Fix For: 0.7
>
>         Attachments: hppc-0.4.3.jar, MAHOUT-1225.patch, MAHOUT-1225.patch, 
> MAHOUT-1225.patch, mahout-math-0.8-SNAPSHOT.jar
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The code I attached hangs forever; Eclipse does not print a stack trace 
> because the program never terminates. So I decided to make a small 
> test.java file that you can easily run.
> This code has a main function that simply runs the getItemList() method, 
> which successfully executes the getDataset() method (please download the 
> mushroom.dat dataset and set its full path in the filePath string variable) 
> and then hangs (the problem happens on the fourth columnValues.add() call). 
> After the dataset is loaded into the X array, the code simply goes through X 
> column by column and searches for the distinct items in it.
> If you uncomment IntSet columnValues = new IntOpenHashSet(); and the 
> corresponding import statements, everything works fine (you will also need 
> to include the hppc jar file found at 
> http://labs.carrotsearch.com/hppc.html or in the attachments below).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-03 Thread Robin Anil (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673072#comment-13673072
 ] 

Robin Anil commented on MAHOUT-1225:


Yes, I saw that; that's why I committed your patch. I thought there was 
something else.

http://svn.apache.org/viewvc/mahout/trunk/math/src/main/java-templates/org/apache/mahout/math/map/OpenObjectValueTypeHashMap.java.t?r1=1488607&r2=1488606&pathrev=1488607



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-03 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673031#comment-13673031
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Take a look at this test:
{code}
@Test
public void testClearTable() throws Exception {
  OpenObjectIntHashMap<Integer> m = new OpenObjectIntHashMap<Integer>();
  m.clear(); // rehash from the default capacity to the next prime after 1 (3).
  m.put(1, 2);
  m.clear(); // Should clear internal references.

  // Read the private key array reflectively to inspect what clear() left behind.
  Field tableField = m.getClass().getDeclaredField("table");
  tableField.setAccessible(true);
  Object[] table = (Object[]) tableField.get(m);

  // Every slot must be null, so the set of distinct slot values is exactly {null}.
  assertEquals(
      new HashSet<Object>(Arrays.asList(new Object[] { null })),
      new HashSet<Object>(Arrays.asList(table)));
}
{code}

This fails because clear() does not explicitly erase the table of references. 
It relies on a rehash to do that, but the rehash only happens when the table 
needs resizing; when it doesn't, the old references stay strongly reachable 
from the table. The fix is to:

{code}
   public void clear() {
     Arrays.fill(this.state, FREE);
+    Arrays.fill(this.table, null);
+
     distinct = 0;
     freeEntries = table.length; // delta
     trimToSize();
{code}

You could avoid the extra fill by returning a boolean from trimToSize() and 
checking whether the internal buffers have been reallocated (and thus the 
references freed).
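
A minimal sketch of that boolean-returning variant, for illustration only. 
ClearSketch is a hypothetical toy map with linear probing, not Mahout's 
OpenObjectIntHashMap; the field names follow the patch above, while 
MIN_CAPACITY, resize() and the simplified trimToSize() (which assumes the map 
is already empty) are assumptions made up for this example.

```java
import java.util.Arrays;

/** Toy open-addressing map (hypothetical; not the Mahout implementation). */
class ClearSketch {
    private static final byte FREE = 0;
    private static final byte FULL = 1;
    private static final int MIN_CAPACITY = 4; // assumed default capacity

    Object[] table = new Object[MIN_CAPACITY];
    int[] values = new int[MIN_CAPACITY];
    byte[] state = new byte[MIN_CAPACITY];
    int distinct;

    void put(Object key, int value) {
        // Keep the load factor at or below 1/2 so probing always finds a FREE slot.
        if (2 * (distinct + 1) > table.length) {
            resize(table.length * 2);
        }
        int i = (key.hashCode() & 0x7fffffff) % table.length;
        while (state[i] == FULL && !table[i].equals(key)) {
            i = (i + 1) % table.length; // linear probing
        }
        if (state[i] == FREE) {
            distinct++;
        }
        table[i] = key;
        values[i] = value;
        state[i] = FULL;
    }

    private void resize(int capacity) {
        Object[] oldTable = table;
        int[] oldValues = values;
        byte[] oldState = state;
        table = new Object[capacity];
        values = new int[capacity];
        state = new byte[capacity];
        distinct = 0;
        for (int i = 0; i < oldTable.length; i++) {
            if (oldState[i] == FULL) {
                put(oldTable[i], oldValues[i]);
            }
        }
    }

    /**
     * Simplified: assumes the map is already empty (it is only called from clear()).
     * Returns true if fresh arrays were allocated, false if the old ones were kept.
     */
    boolean trimToSize() {
        if (table.length == MIN_CAPACITY) {
            return false; // nothing to shrink, old arrays (and their contents) remain
        }
        table = new Object[MIN_CAPACITY];
        values = new int[MIN_CAPACITY];
        state = new byte[MIN_CAPACITY];
        return true;
    }

    void clear() {
        distinct = 0;
        if (!trimToSize()) {
            // Buffers were reused, so both arrays must be wiped by hand.
            Arrays.fill(state, FREE);
            Arrays.fill(table, null);
        }
    }
}
```

The trade-off is one branch in clear() against two array fills that are 
redundant whenever the buffers are reallocated anyway; the unconditional fill 
in the patch is simpler and harder to get wrong.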



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-03 Thread Robin Anil (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672909#comment-13672909
 ] 

Robin Anil commented on MAHOUT-1225:


Could you elaborate on the buggy scenario? I don't see an option to reopen it 
myself.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-02 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672852#comment-13672852
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Ehm. I've closed this issue as per Robin's comment above but I don't think this 
was the right way to go -- it should have been left open (with a "fixed" 
resolution) until a release is made. Apologies for the noise. I can't reopen it 
now -- I'm probably missing some Jira karma to do this. Please correct my 
mistake if you have admin rights: reopen, and then bulk close at release time. 
Thanks!



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-02 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672695#comment-13672695
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Thanks Robin!

Jake - there was still a small issue with clearing references -- a potential 
memory leak, although in practice probably not encountered. See the test case 
that was previously failing and you'll see a potential buggy scenario.
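
To make the leak concrete, here is a small sketch. StaleReferenceSketch is a 
hypothetical toy set, not Mahout code, and both clear methods are invented for 
the comparison. Resetting only the state array makes the set look empty, but 
the key array keeps pointing at the old objects, so they cannot be garbage 
collected until those slots happen to be overwritten.

```java
import java.util.Arrays;

/** Toy open-addressing set (hypothetical) contrasting two ways of clearing. */
class StaleReferenceSketch {
    private static final byte FREE = 0;
    private static final byte FULL = 1;

    final Object[] table = new Object[8];
    final byte[] state = new byte[8];
    int size;

    void add(Object key) {
        if (size == table.length - 1) {
            throw new IllegalStateException("toy set is full");
        }
        int i = (key.hashCode() & 0x7fffffff) % table.length;
        while (state[i] == FULL) {
            if (table[i].equals(key)) {
                return; // already present
            }
            i = (i + 1) % table.length; // linear probing
        }
        table[i] = key;
        state[i] = FULL;
        size++;
    }

    /** Marks every slot FREE but leaves the key references in place. */
    void clearStateOnly() {
        Arrays.fill(state, FREE);
        size = 0;
    }

    /** Marks every slot FREE and drops the key references as well. */
    void clearStateAndTable() {
        clearStateOnly();
        Arrays.fill(table, null);
    }

    /** Number of slots that still hold a reference, whatever their state says. */
    int retainedReferences() {
        int n = 0;
        for (Object o : table) {
            if (o != null) {
                n++;
            }
        }
        return n;
    }
}
```

After adding two keys, clearStateOnly() leaves retainedReferences() at 2 while 
size is 0; clearStateAndTable() brings it down to 0.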





Re: [jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-02 Thread Robin Anil
Look at Dawid's latest patch. OpenObject.* fixes and randomized tests
On Jun 2, 2013 7:10 AM, "Jake Mannix (JIRA)"  wrote:

>
> [
> https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672318#comment-13672318]
>
> Jake Mannix commented on MAHOUT-1225:
> -
>
> What exactly did you end up submitting Robin?  I thought we had determined
> that this was fixed already?  You just added the new randomized testing
> that Dawid added?  OpenObjectValueTypeHashMap still had the issue?
>


[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-01 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672318#comment-13672318
 ] 

Jake Mannix commented on MAHOUT-1225:
-

What exactly did you end up submitting Robin?  I thought we had determined that 
this was fixed already?  You just added the new randomized testing that Dawid 
added?  OpenObjectValueTypeHashMap still had the issue?



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672258#comment-13672258
 ] 

Hudson commented on MAHOUT-1225:


Integrated in Mahout-Quality #2026 (See 
[https://builds.apache.org/job/Mahout-Quality/2026/])
MAHOUT-1225 Nasty bug in Mahout collections. Sets and maps incorrectly 
clear() their state arrays (potential endless loops) - author: dweiss (Revision 
1488607)

 Result = SUCCESS
robinanil : 
Files : 
* /mahout/trunk/math/pom.xml
* /mahout/trunk/math/src/main/java-templates/org/apache/mahout/math/map/OpenObjectValueTypeHashMap.java.t
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/randomized
* /mahout/trunk/math/src/test/java/org/apache/mahout/math/randomized/RandomBlasting.java




[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-06-01 Thread Robin Anil (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672243#comment-13672243
 ] 

Robin Anil commented on MAHOUT-1225:


I am going to submit this for Dawid to close this. Excellent catch and even 
better testing.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Suneel Marthi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665379#comment-13665379
 ] 

Suneel Marthi commented on MAHOUT-1225:
---

Agree with Jake, it's the same situation at my workplace; our Hadoop cluster is 
stuck at Java 1.6.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665367#comment-13665367
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Could somebody with permissions remove that dead branch of collections from 
SVN? Alternatively, I'd at least remove the contents of the trunk and place a 
simple README file there redirecting to the new location of the current 
development branch.

Not that it wasn't a lot of fun debugging but we can save somebody else from 
the mistake I made. :)



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665361#comment-13665361
 ] 

Jake Mannix commented on MAHOUT-1225:
-

I'm not sure everyone's hadoop cluster is on 1.7 (I know ours here at Twitter 
isn't), so moving to 1.7 seems a little early.

Fixing bugs sounds like a good idea. :)



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Sophie Sperner (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665358#comment-13665358
 ] 

Sophie Sperner commented on MAHOUT-1225:


Thank you Suneel! What I think is important for the new 0.8 release: bring it 
up to date with Java 1.7 and fix as many bugs as possible.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665335#comment-13665335
 ] 

Jake Mannix commented on MAHOUT-1225:
-

To build from trunk (which is what we all do, for the most part), see the wiki, 
here: https://cwiki.apache.org/confluence/display/MAHOUT/BuildingMahout



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Sophie Sperner (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665276#comment-13665276
 ] 

Sophie Sperner commented on MAHOUT-1225:


I tried to follow 
http://stackoverflow.com/questions/16717858/how-to-make-jar-out-of-github-sources
but I get lots of errors and no jar file. Is it possible to make a jar file?



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665275#comment-13665275
 ] 

Ted Dunning commented on MAHOUT-1225:
-

{quote}
Wouldn't it be a good idea to release version 0.8 with the most current code
{quote}

Yes.  In fact, we are already in the process of stabilizing for a release.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Sophie Sperner (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665250#comment-13665250
 ] 

Sophie Sperner commented on MAHOUT-1225:


Wouldn't it be a good idea to release version 0.8 with the most current code, 
especially since almost a year has passed?



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Sophie Sperner (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665248#comment-13665248
 ] 

Sophie Sperner commented on MAHOUT-1225:


Dear all, I went to http://repo1.maven.org/maven2/org/apache/mahout/mahout-math/0.7/ 
and downloaded mahout-math-0.7.jar. Does it contain lots of bugs? What is the 
newest version? If you point me to GitHub, I do not know how to build a jar 
file from source. Could you please tell me how to get the most recent stable 
code? I apologise if I said bad things about the old version.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665188#comment-13665188
 ] 

Jake Mannix commented on MAHOUT-1225:
-

Ah yes, we merged collections back into math a while back.  I'd love to have 
your test cases added in, more coverage == better!



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665185#comment-13665185
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Yeah... good to know where the current development takes place. I'll reapply 
this to the current code and will be back if any of these issues still apply.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665184#comment-13665184
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Darn... I think I've been working on an obsolete branch? :) Is this the current 
one?

http://svn.apache.org/repos/asf/mahout/trunk/math/src/main/java-templates/org/apache/mahout/math/

I think the old one should be removed or moved under an attic somewhere...



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665180#comment-13665180
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

Hi Jake. No idea, really. I just checked out 
http://svn.apache.org/repos/asf/mahout/collections/trunk and followed from 
there. I am subscribed to the list but I'm not following much of Mahout's 
development nowadays.



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665174#comment-13665174
 ] 

Jake Mannix commented on MAHOUT-1225:
-

Wait, was this not _exactly_ the bug in 
https://issues.apache.org/jira/browse/MAHOUT-1186 ?

How did this creep back in?



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665109#comment-13665109
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

There are a bunch of issues, actually. Randomized testing against JUC allowed 
me to spot the following (in no particular order):

{code}
inconsistent c



[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-22 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664918#comment-13664918
 ] 

Dawid Weiss commented on MAHOUT-1225:
-

No, this is not the right fix, Tom. That loop is correct: open addressing 
allows a "wrap around" in the array, and this "if" clause does exactly that. The 
real problem is that indexOf (and the other open addressing methods) relies on 
the invariant that there is at least one empty slot, and that invariant does 
not hold here because of the bug I mentioned.
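
To make the invariant concrete, here is a minimal sketch. It is not the Mahout source: the class FreeSlotSketch, its fields, and the -2 return value are hypothetical, and the double-hashing probe is an assumption about the general scheme. The probe budget exists only so the demonstration terminates; an unbounded loop would spin forever in the same situation. The capacity is assumed to be prime so that the probe visits every slot.

```java
import java.util.Arrays;

final class FreeSlotSketch {
    static final byte FREE = 0, FULL = 1, REMOVED = 2;

    final int[] table;   // stored keys
    final byte[] state;  // per-slot state: FREE, FULL or REMOVED

    FreeSlotSketch(int primeCapacity) {
        table = new int[primeCapacity];
        state = new byte[primeCapacity]; // zero-initialised, i.e. all FREE
    }

    /**
     * Returns the slot of key, -1 if the key is absent, or -2 (hypothetical
     * marker) if no FREE slot was met within table.length probes.
     */
    int indexOfKey(int key) {
        int length = table.length;
        int hash = key & 0x7FFFFFFF;
        int i = hash % length;
        int decrement = hash % (length - 2);
        if (decrement == 0) decrement = 1;
        for (int probes = 0; probes < length; probes++) {
            if (state[i] == FREE) return -1;               // search ends at a free slot
            if (state[i] == FULL && table[i] == key) return i;
            i -= decrement;
            if (i < 0) i += length;                        // wrap around
        }
        return -2; // invariant broken: an unbounded probe would never stop
    }

    /** Resets every slot to FREE, which restores the invariant. */
    void clear() {
        Arrays.fill(state, FREE);
    }

    public static void main(String[] args) {
        FreeSlotSketch s = new FreeSlotSketch(7);
        // Simulate a corrupted table: every slot is FULL or REMOVED, none FREE.
        for (int i = 0; i < 7; i++) {
            s.table[i] = i;
            s.state[i] = (i % 2 == 0) ? FULL : REMOVED;
        }
        System.out.println(s.indexOfKey(100)); // prints -2
        s.clear();
        System.out.println(s.indexOfKey(100)); // prints -1
    }
}
```

A search for an absent key can only stop at a FREE slot, so once none is left the lookup has no exit condition; resetting the state array brings it back.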


[jira] [Commented] (MAHOUT-1225) Sets and maps incorrectly clear() their state arrays (potential endless loops)

2013-05-22 Thread Tom Marthaler (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664613#comment-13664613
 ] 

Tom Marthaler commented on MAHOUT-1225:
---

I looked at it. In the OpenIntHashSet.indexOfKey(int) method, the while loop 
never completes because the i variable is being decremented as well as 
incremented (line 226).

The fix is to move the
  if (i < 0) {
    i += length;
  }
code out of the loop.
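
For reference, a minimal sketch of the probe step under discussion. This is not the Mahout source: the names WrapAroundSketch and probeSequence are hypothetical, and the double-hashing scheme (start at hash % length, then step back by a fixed decrement) is an assumption based on the snippet above. It records which slots are visited when the wrap-around check sits inside the loop.

```java
import java.util.Arrays;

final class WrapAroundSketch {
    /**
     * Returns the first `steps` slots visited by a double-hashing probe
     * over a table of the given length (assumed to be greater than 2).
     */
    static int[] probeSequence(int key, int length, int steps) {
        int hash = key & 0x7FFFFFFF;
        int i = hash % length;
        int decrement = hash % (length - 2);
        if (decrement == 0) decrement = 1;

        int[] visited = new int[steps];
        for (int s = 0; s < steps; s++) {
            visited[s] = i;
            i -= decrement;          // step backwards through the table
            if (i < 0) i += length;  // wrap around; needed on every iteration
        }
        return visited;
    }

    public static void main(String[] args) {
        // key 12, length 7: start at slot 5, step back by 2 each time
        System.out.println(Arrays.toString(probeSequence(12, 7, 7)));
        // prints [5, 3, 1, 6, 4, 2, 0]
    }
}
```

In this sketch the index is decremented on every iteration, so it can drop below zero at any step; with the check hoisted out of the loop, the fourth probe above would index slot -1.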
