Hi
We have just tested the new Spark 2.1.1 release, and observe an issue where
the driver program hangs when making predictions using a random forest. The
issue disappears when downgrading to 2.1.0.
Has anyone observed similar issues? Recommendations on how to dig into this
would also be much appreciated.
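One generic way to dig into a hung driver (a JVM-level approach, not anything specific from this thread): capture a thread dump with `jstack <driver-pid>` and look for threads stuck in BLOCKED or WAITING states. A minimal sketch of sifting such a dump; the dump text below is a toy stand-in, not real Spark output:

```python
import re

def stuck_threads(dump: str):
    """Return (thread-name, state) pairs for threads not currently RUNNABLE."""
    hits = []
    for m in re.finditer(r'"([^"]+)".*\n\s*java\.lang\.Thread\.State: (\w+)', dump):
        name, state = m.group(1), m.group(2)
        if state in ("BLOCKED", "WAITING", "TIMED_WAITING"):
            hits.append((name, state))
    return hits

# Toy stand-in for `jstack <driver-pid> > dump.txt` output:
sample = '''\
"dispatcher-event-loop-0" #23 daemon prio=5 tid=0x00 waiting on condition
   java.lang.Thread.State: WAITING (parking)

"main" #1 prio=5 tid=0x01 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x02 runnable
'''
print(stuck_threads(sample))
```

Several dumps taken a few seconds apart will show whether the same thread stays parked in the same stack frame, which is the usual signature of a genuine hang rather than slow progress.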
For anyone revisiting this at a later point, the issue was that Spark 2.1.0
upgrades netty to version 4.0.42 which is not binary compatible with version
4.0.37 used by version 3.1.0 of the Cassandra Java Driver. The newer version
can work with Cassandra, but because of differences in the maven
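For anyone hitting the same clash: one common remedy is to make sure only one netty version ends up on the classpath. A sketch for an sbt build, excluding the driver's older per-module netty jars so Spark's netty-all wins; the exact module names are my assumption, so check what your build actually pulls in (e.g. with a dependency-tree report) before copying this:

```scala
// build.sbt sketch -- the io.netty module list below is an assumption,
// verify against your own dependency tree.
libraryDependencies += ("com.datastax.cassandra" % "cassandra-driver-core" % "3.1.0")
  .exclude("io.netty", "netty-handler")
  .exclude("io.netty", "netty-buffer")
  .exclude("io.netty", "netty-transport")
  .exclude("io.netty", "netty-codec")
```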
I am also experiencing this. Do you have a JIRA on it?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Error-PartitioningCollection-requires-all-of-its-partitionings-have-the-same-numPartitions-tp27875p28272.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi
We just tested a switch from Spark 2.0.2 to Spark 2.1.0 on our codebase. It
compiles fine, but introduces the following runtime exception upon
initialization of our Cassandra database. I can't find any clues in the
release notes. Has anyone experienced this?
Morten
sbt.ForkMain$ForkError:
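When an upgrade produces a low-level runtime error like this, a quick first check is whether two copies of the same library (netty, per the resolution quoted elsewhere in these results) end up on the classpath. A sketch that scans a saved `mvn dependency:tree` dump for netty artifacts at conflicting versions; the tree text here is a toy stand-in for real output:

```python
import re

def netty_versions(tree: str):
    """Collect (artifact, version) pairs for every io.netty jar in the tree."""
    return sorted(set(re.findall(r'io\.netty:([\w-]+):jar:([\w.]+)', tree)))

# Toy stand-in for `mvn dependency:tree > tree.txt` output:
sample = r"""
+- org.apache.spark:spark-core_2.11:jar:2.1.0
|  \- io.netty:netty-all:jar:4.0.42.Final
\- com.datastax.cassandra:cassandra-driver-core:jar:3.1.0
   \- io.netty:netty-handler:jar:4.0.37.Final
"""
pairs = netty_versions(sample)
versions = {v for _, v in pairs}
print(pairs)
if len(versions) > 1:
    print("conflicting netty versions:", sorted(versions))
```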
Hi
I have spent quite some time trying to debug an issue with the Random Forest
algorithm on Spark 2.0.2. The input dataset is relatively large at around
600k rows and 200MB, but I use subsampling to make each tree manageable.
However, even with only 1 tree and a low sample rate of 0.05 the job
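For scale (just arithmetic on the numbers above, not anything from the thread): a 0.05 subsample of 600k rows leaves only about 30k rows and roughly 10 MB per tree, so raw data volume is an unlikely culprit.

```python
# Back-of-envelope on the per-tree workload under subsampling.
rows, size_mb, rate = 600_000, 200, 0.05
rows_per_tree = round(rows * rate)
mb_per_tree = round(size_mb * rate)
print(rows_per_tree, "rows,", mb_per_tree, "MB per tree")
```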
I can't find any JIRA issues with the tag that are unresolved. Apologies if
this is a rookie mistake and the information is available elsewhere.
Morten
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Any-estimate-for-a-Spark-2-0-1-release-date-tp27659.html
I don't think that's the issue. It sounds very much like this:
https://issues.apache.org/jira/browse/SPARK-16664
Morten
> On 20 Aug 2016, at 21:24, ponkin [via Apache Spark User List] wrote:
>
> Did you try to load a wide, for example, CSV file or
Cassandra.
Morten
> On 20 Aug 2016, at 13:53, ponkin [via Apache Spark User List] wrote:
>
> Hi,
> What kind of datasource do you have? CSV, Avro, Parquet?
I did some extra digging. Running the query "select column1 from myTable", I
can reproduce the problem on a frame with a single row: it occurs exactly
when the frame has more than 200 columns, which smells a bit like a
hardcoded limit.
Interestingly, the problem disappears when replacing the query
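A systematic way to confirm a suspected hard limit like this is to binary-search for the smallest failing column count. The sketch below uses a stand-in predicate; `query_fails` is a hypothetical helper that, in practice, would build a one-row frame with n columns and run the failing query:

```python
def smallest_failing(query_fails, lo=1, hi=1500):
    """Smallest n in [lo, hi] with query_fails(n) True, assuming failures are monotone in n."""
    while lo < hi:
        mid = (lo + hi) // 2
        if query_fails(mid):
            hi = mid      # failure at mid: the threshold is at or below mid
        else:
            lo = mid + 1  # success at mid: the threshold is above mid
    return lo

# Stand-in predicate mirroring the observed behaviour (breaks above 200 columns):
print(smallest_failing(lambda n: n > 200))
```

Landing exactly on a round number like 201 is a strong hint that some internal constant, rather than the data, is the trigger.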
Hi
We currently have some workloads in Spark 1.6.2 with queries operating on a
data frame with 1500+ columns (17000 rows). This has never been quite
stable, and some queries, such as "select *" would yield empty result sets,
but queries restricting to specific columns have mostly worked. Needless
Hi
We are currently running a setup with Spark 1.6.2 inside Docker. It requires
the use of the HTTPBroadcastFactory instead of the default
TorrentBroadcastFactory to avoid the use of random ports, which cannot be
exposed through Docker. From the Spark 2.0 release notes I can see that the
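Since the torrent broadcast path goes through the block manager, one workaround (my assumption, not something stated in this thread) is to pin Spark's otherwise-random ports so they can be published explicitly through Docker, e.g. in spark-defaults.conf:

```properties
# Sketch: fixed ports instead of random ones; the numbers are arbitrary
# examples -- publish the same ones with `docker run -p`.
spark.driver.port         51000
spark.blockManager.port   51001
```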