Re: Collecting matrix's entries raises an error only when run inside a test

2017-07-06 Thread Yanbo Liang
Hi Simone,

Would you mind sharing a minimal code example that reproduces this issue?

Yanbo

On Wed, Jul 5, 2017 at 10:52 PM, Simone Robutti wrote:

> Hello, I have a problem that Google is not helping with. It looks like an
> unreported bug, and I have found no hints of possible workarounds.
>
> The error is the following:
>
> Traceback (most recent call last):
>   File "/home/simone/motionlogic/trip-labeler/test/trip_labeler_test/model_test.py", line 43, in test_make_trip_matrix
>     entries = trip_matrix.entries.map(lambda entry: (entry.i, entry.j, entry.value)).collect()
>   File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 770, in collect
>     with SCCallSiteSync(self.context) as css:
>   File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/traceback_utils.py", line 72, in __enter__
>     self._context._jsc.setCallSite(self._call_site)
> AttributeError: 'NoneType' object has no attribute 'setCallSite'
>
> It is raised when I try to collect the entries of a
> pyspark.mllib.linalg.distributed.CoordinateMatrix with .collect(). It
> happens only when this runs in a test suite with more than one class, so it
> is probably related to the creation and destruction of SparkContexts, but I
> cannot understand how.
>
> Spark version is 1.6.2
>
> I saw multiple references to this error for other classes in the pyspark
> ml library, but none of them contained hints toward a solution.
>
> It breaks when I run the tests through nosetests. Running a single
> TestCase in IntelliJ works fine.
>
> Is there a known solution? Is it a known problem?
>
> Thank you,
>
> Simone
>


Collecting matrix's entries raises an error only when run inside a test

2017-07-05 Thread Simone Robutti
Hello, I have a problem that Google is not helping with. It looks like an
unreported bug, and I have found no hints of possible workarounds.

The error is the following:

Traceback (most recent call last):
  File "/home/simone/motionlogic/trip-labeler/test/trip_labeler_test/model_test.py", line 43, in test_make_trip_matrix
    entries = trip_matrix.entries.map(lambda entry: (entry.i, entry.j, entry.value)).collect()
  File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 770, in collect
    with SCCallSiteSync(self.context) as css:
  File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/traceback_utils.py", line 72, in __enter__
    self._context._jsc.setCallSite(self._call_site)
AttributeError: 'NoneType' object has no attribute 'setCallSite'

It is raised when I try to collect the entries of a
pyspark.mllib.linalg.distributed.CoordinateMatrix with .collect(). It
happens only when this runs in a test suite with more than one class, so it
is probably related to the creation and destruction of SparkContexts, but I
cannot understand how.
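
To make the suspicion about context creation and destruction concrete, here
is a rough sketch in plain Python, with no Spark involved. Everything in it
is a hypothetical stand-in: FakeSparkContext, FakeJavaContext, get_or_create
and the module-level cache are not PySpark APIs. It assumes that stopping a
context clears its `_jsc` handle and that some helper keeps returning the
first context it ever saw. Whether PySpark 1.6.2 actually behaves this way
is not confirmed here.

```python
# Plain-Python stand-ins; none of these names are PySpark APIs.
class FakeJavaContext:
    def setCallSite(self, site):
        self.site = site


class FakeSparkContext:
    """Hypothetical stand-in for a SparkContext."""

    def __init__(self):
        self._jsc = FakeJavaContext()

    def stop(self):
        # Assumption: stopping drops the JVM handle, as the traceback suggests.
        self._jsc = None


_cached_context = None  # hypothetical cache shared by every test class


def get_or_create(sc):
    """Return the first context ever seen, even if it was stopped since."""
    global _cached_context
    if _cached_context is None:
        _cached_context = sc
    return _cached_context


def collect(sc):
    # Same attribute access as the failing line in traceback_utils.py.
    get_or_create(sc)._jsc.setCallSite("collect at model_test.py:43")
    return "ok"


def run_suite():
    """Simulate two test classes, each creating and stopping its own context."""
    results = []
    for _ in range(2):
        sc = FakeSparkContext()
        try:
            results.append(collect(sc))
        except AttributeError as exc:
            results.append(str(exc))
        finally:
            sc.stop()
    return results
```

Calling run_suite() lets the first simulated class pass, while the second
hits an AttributeError on setCallSite because the cached context was already
stopped. That would match the symptom that a single TestCase works and a
suite with more than one class does not.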

Spark version is 1.6.2

I saw multiple references to this error for other classes in the pyspark
ml library, but none of them contained hints toward a solution.

It breaks when I run the tests through nosetests. Running a single
TestCase in IntelliJ works fine.

Is there a known solution? Is it a known problem?

Thank you,

Simone