Hi,
I am running some TPC-DS queries (data is Parquet stored in HDFS) with Spark 2.0
RC5, and for some queries I get this OOM:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at
org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder.grow(BufferHolder.java:73)
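For context, a hedged sketch of why this particular OOM shows up (this is not Spark's actual BufferHolder code; names and numbers are illustrative): JVM arrays are indexed by `int`, so a single array tops out near `Integer.MAX_VALUE` bytes, and a grow policy that doubles the buffer for a very wide row can end up requesting an array past that cap.

```java
// Illustrative only: why a doubling grow policy can exceed the JVM's
// per-array limit and trigger "Requested array size exceeds VM limit".
public class GrowSketch {
    // Roughly the largest array HotSpot will allocate (a few header words
    // are reserved below Integer.MAX_VALUE).
    public static final long VM_ARRAY_LIMIT = Integer.MAX_VALUE - 8;

    /** Capacity after growing: at least `used + needed`, but the policy doubles. */
    public static long grownSize(long used, long needed) {
        return Math.max(used + needed, used * 2);
    }

    public static void main(String[] args) {
        long used = 1L << 30;                         // 1 GiB already buffered
        long requested = grownSize(used, 512L << 20); // need 512 MiB more
        // Doubling asks for 2 GiB, which is past the per-array cap.
        System.out.println(requested > VM_ARRAY_LIMIT); // prints "true"
    }
}
```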
> <https://github.com/apache/spark/pull/13775>
>
> Thanks!
>
> 2016-07-25 19:01 GMT+09:00 Ovidiu-Cristian MARCU
> <ovidiu-cristian.ma...@inria.fr <mailto:ovidiu-cristian.ma...@inria.fr>>:
Hi,
Assuming I have some data in both ORC and Parquet formats, and some complex
workflow that eventually combines the results of queries on these datasets, I
would like to get the best execution. Looking at the default configs I
noticed:
1) Vectorized query execution possible with Parquet
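For reference, the relevant switches in Spark 2.0 look roughly like the following spark-defaults.conf fragment (names assumed from the 2.0 release; the vectorized reader covers Parquet only, so ORC scans fall back to the row-at-a-time path in this release):

```
# On by default in 2.0; shown explicitly for clarity.
spark.sql.parquet.enableVectorizedReader  true
# Whole-stage codegen, also on by default in 2.0.
spark.sql.codegen.wholeStage              true
```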
Hi,
I have a TPC-DS query that fails in stage 80, which is a ResultStage
(Spark SQL).
Ideally I would like to ‘checkpoint’ a previous stage which was executed
successfully and replay the failed stage for debug purposes.
Has anyone managed to do something similar and could share some hints?
Maybe
.akka.frameSize 128
spark.shuffle.manager sort
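One common workaround, since Spark 2.0 has no Dataset-level checkpoint: materialize the last good stage's result to HDFS, then replay only the failing part against that snapshot. A hedged sketch (the path, view name, and query split are made up for illustration; assumes an existing `SparkSession spark`):

```java
// Materialize the output of the part of the query that succeeds.
Dataset<Row> goodPrefix = spark.sql("SELECT ... /* work up to stage 79 */");
goodPrefix.write().mode("overwrite").parquet("hdfs:///tmp/q-snapshot");

// Replay only the failing final stage, possibly in a fresh session.
spark.read().parquet("hdfs:///tmp/q-snapshot").createOrReplaceTempView("snapshot");
Dataset<Row> result = spark.sql("SELECT ... FROM snapshot /* failing stage */");
```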
> On 14 Jun 2016, at 00:12, Sameer Agarwal <sam...@databricks.com> wrote:
>
> I'm unfortunately not able to reproduce this on master. Does the query always
> fail deterministically?
>
> On Mon, Jun 13,
Yes, commit ad102af
> On 13 Jun 2016, at 21:25, Reynold Xin <r...@databricks.com> wrote:
>
> Did you try this on master?
>
>
> On Mon, Jun 13, 2016 at 11:26 AM, Ovidiu-Cristian MARCU
> <ovidiu-cristian.ma...@inria.fr <mailto:ovidiu-cristian.ma...@inria.
Hi,
Running the first query of TPC-DS on a standalone setup (4 nodes, tpcds2
generated for scale 10 and transformed to Parquet under HDFS) results in
an exception [1].
Close to this problem, I found this issue:
https://issues.apache.org/jira/browse/SPARK-12089
+1 for moving this discussion to a proactive new (alpha/beta) release of Apache
Spark 2.0!
> On 06 Jun 2016, at 20:25, Ovidiu Cristian Marcu <oma...@inria.fr> wrote:
>
> Any chance to start preparing a new alpha/beta release for 2.0 this month or
> the preview will
Hi all
IMHO the preview ‘release’ is good as it is now, so no further changes are required.
For me the preview was a teaser for what will be the next Spark 2.0; I really
appreciate the effort the team made to describe and market it :)
I’d appreciate it if the Apache Spark team would start a vote for a new
Do you need more information?
> On 23 May 2016, at 19:16, Ovidiu-Cristian MARCU
> <ovidiu-cristian.ma...@inria.fr> wrote:
>
> Yes,
>
> git log
> commit dafcb05c2ef8e09f45edfb7eabf58116c23975a0
> Author: Sameer Agarwal <sam...@databricks.com <mailto:sam...@d
On 23 May 2016, at 18:16, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Can you tell us the commit hash using which the test was run ?
>
> For #2, if you can give full stack trace, that would be nice.
>
> Thanks
>
> On Mon, May 23, 2016 at 8:58 AM, Ovidiu-Cris
Hi
1) Using the latest Spark 2.0 I've managed to run the first 9 queries of
TPCDSQueryBenchmark, and then it ends in an OutOfMemoryError [1].
What was the configuration used for running this benchmark? Can you explain the
meaning of 4 shuffle partitions? Thanks!
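For what it's worth, `spark.sql.shuffle.partitions` controls how many reduce-side partitions Spark SQL uses for joins and aggregations (default 200); a benchmark driven from a single JVM typically pins it low. A spark-defaults.conf sketch (the benchmark presumably sets the equivalent programmatically):

```
# Number of partitions used when shuffling data for joins/aggregations.
spark.sql.shuffle.partitions  4
```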
On my local system I use:
> On Mon, May 23, 2016 at 2:16 AM, Ovidiu-Cristian MARCU
> <ovidiu-cristian.ma...@inria.fr <mailto:ovidiu-cristian.ma...@inria.fr>>
> wrote:
> Hi
>
> I have the following issue when trying to build the latest spark source code
> on master:
>
> /spark/com
Hi
I have the following issue when trying to build the latest spark source code on
master:
/spark/common/network-common/src/main/java/org/apache/spark/network/util/JavaUtils.java:147:
error: cannot find symbol
[error] if (process != null && process.isAlive()) {
[error]
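`Process.isAlive()` only exists since Java 8, so this `cannot find symbol` error usually means the build picked up a JDK 7 `javac`. A quick check (the JDK path below is illustrative):

```
java -version                                   # should report 1.8.x for master
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk    # illustrative path
build/mvn -DskipTests clean package
```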
find that by changing the filter to target version = 2.0.0. Cheers.
>
> On Wed, May 18, 2016 at 9:00 AM, Ovidiu-Cristian MARCU
> <ovidiu-cristian.ma...@inria.fr <mailto:ovidiu-cristian.ma...@inria.fr>>
> wrote:
+1 Great, I see the list of resolved issues; do you have a list of known issues
that will remain in this release?
with
build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Phive -Phive-thriftserver
-DskipTests clean package
mvn -version
Apache Maven 3.3.9