Hello, I'm using Spark 2.1.0 with Scala, and I'm registering all classes with
Kryo. I have a problem registering this class:
org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex$SerializableFileStatus$SerializableBlockLocation[]
I can't register it with
classOf[Array[Class.forNam
Do I have to do something like m1.multiply(m2).count()?
Thanks.
--
Good day, joy!!
José Francisco Saray Villamizar
cell +33 6 13710693
Lyon, France
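For reference, a minimal sketch of registering such an array class by its JVM
runtime name ("[L...;" for an array of a reference type), since classOf cannot
name that private inner class directly; Spark 2.1.x is assumed, and the class
name is copied from the question above:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
    // "[L<binary name>;" is the JVM name for an array of the given class.
    conf.registerKryoClasses(Array(
      Class.forName("[Lorg.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex$SerializableFileStatus$SerializableBlockLocation;")
    ))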
matrix . ?
Is this time normal?
Thank you.
--
Good day, joy!!
José Francisco Saray Villamizar
cell +33 6 13710693
Lyon, France
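A minimal sketch of forcing the matrix multiply mentioned above to actually
execute so it can be timed; m1 and m2 are assumed to be the BlockMatrix
instances from the question:

    import org.apache.spark.mllib.linalg.distributed.BlockMatrix

    // multiply is lazy until an action runs on the underlying blocks RDD,
    // so counting the result blocks is what actually triggers the job.
    val product: BlockMatrix = m1.multiply(m2)
    product.blocks.count()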
We have data stored in S3, partitioned by several columns. Let's say it
follows this hierarchy:
s3://bucket/data/column1=X/column2=Y/parquet-files
We run a Spark job on an EMR cluster (1 master, 3 slaves) and realised the
following:
A) When we declare the initial dataframe to be the whole dataset
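For context, a minimal sketch of reading such a layout, assuming Spark 2.x and
the example hierarchy above; a filter on a partition column should prune the
read to the matching S3 directories instead of scanning the whole dataset:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("pruning-sketch").getOrCreate()

    // Reading the root lets Spark discover column1/column2 as partition columns.
    val df = spark.read.parquet("s3://bucket/data")

    // A filter on a partition column becomes a PartitionFilter, so only the
    // matching directories should be listed and scanned.
    val pruned = df.filter(df("column1") === "X")
    pruned.explain() // look for PartitionFilters in the physical plan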
Hello,
Do you use df.write, or do you do it with hiveContext.sql("insert into ...")?
Angel.
On Jun 12, 2017, 11:07 PM, "Yong Zhang" wrote:
> We are using Spark *1.6.2* as ETL to generate parquet files for one
> dataset, partitioned by "brand" (which is a string representing the brand
> in this
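For reference, a minimal sketch of the df.write route being asked about,
assuming the Spark 1.6 DataFrame API from the quoted message; df stands for
the dataset being written, and the output path is a placeholder:

    // "brand" is the partition column from the thread; the path is made up.
    df.write
      .partitionBy("brand")
      .mode("overwrite")
      .parquet("hdfs:///path/to/output")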
Thanks,
Asmath
On Tue, May 2, 2017 at 1:38 PM, Angel Francisco Orta <
angel.francisco.o...@gmail.com> wrote:
> Have you tried partitioning by the join field and running it in segments,
> filtering both tables to the same segments of data?
>
> Example:
>
> Val ta
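The quoted example is cut off above; here is a hedged sketch of the
segmented-join idea it describes, assuming Spark 2.x, table1/table2 as the two
DataFrames, a hypothetical join key "id", and an arbitrary bucket count:

    import org.apache.spark.sql.functions.{hash, lit, pmod}

    // Split the join-key space into hash buckets and join bucket by bucket,
    // filtering both tables to the same segment each time.
    val numBuckets = 16 // arbitrary
    val joined = (0 until numBuckets).map { b =>
      val left  = table1.filter(pmod(hash(table1("id")), lit(numBuckets)) === b)
      val right = table2.filter(pmod(hash(table2("id")), lit(numBuckets)) === b)
      left.join(right, "id")
    }.reduce(_ union _)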
join on these tables now.
>
>
>
> On Tue, May 2, 2017 at 1:27 PM, Angel Francisco Orta <
> angel.francisco.o...@gmail.com> wrote:
>
>> Hello,
>>
>> Are the tables partitioned?
>> If yes, what is the partition field?
>>
>> Thanks
>>
>>
Hello,
Are the tables partitioned?
If yes, what is the partition field?
Thanks
On May 2, 2017, 8:22 PM, "KhajaAsmath Mohammed"
wrote:
Hi,
I am trying to join two big tables in Spark, and the job has been running for
quite a long time without any results.
Table 1: 192 GB
Table 2: 92 GB
Does any
Ah yeah, didn't notice that difference.
Thanks! It worked.
On Fri, Apr 17, 2015 at 4:27 AM, Yin Huai wrote:
> For a map-type column, fields['driver'] is the syntax to retrieve the map
> value (in the schema, you can see "fields: map"). The syntax
> fields.driver is used for struct types.
>
> On
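In other words, a minimal sketch of the two access paths, where df and its
"fields" column are assumptions based on the thread:

    // Map column: bracket syntax looks a value up by key.
    val mapValue = df.selectExpr("fields['driver']")
    // Struct column: dot syntax selects a named field instead.
    // val structField = df.selectExpr("fields.driver")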
r in the first
place is because its ROLAP mode (direct query) is still too limited. And thanks
for writing the Klout paper!! We were already using it as a guideline for our
tests.
Best regards,
Francisco
-----Original Message-----
From: "Denny Lee"
Sent: 22/02/2015 17:56
To:
command available for
Windows.
Does somebody know if there is a way to make this work?
Thanks in advance!!
Francisco
Looks like this is a known issue:
https://issues.apache.org/jira/browse/SPARK-1353
Hi,
We are running an aggregation on a huge data set (a few billion rows).
While running the task we got the following error (see below). Any ideas?
We are running Spark 1.1.0 on the CDH distribution.
...
14/09/17 13:33:30 INFO Executor: Finished task 0.0 in stage 1.0 (TID 0).
2083 bytes result sent to driver
14/09
Thanks for the tip.
http://localhost:4040/executors/ is showing:
Executors (1)
Memory: 0.0 B used (294.9 MB total)
Disk: 0.0 B used
However, running as a standalone cluster does resolve the problem.
I can see a worker process running with the allocated memory.
My conclusion (I may be wrong) is for 'l
Thanks for the reply.
I doubt that's the case, though ... the executor kept having to dump to disk
because memory was full.
...
14/09/16 15:00:18 WARN ExternalAppendOnlyMap: Spilling in-memory map of 67
MB to disk (668 times so far)
14/09/16 15:00:21 WARN ExternalAppendOnlyMap: Spilling in-memor
Hi, I'm a Spark newbie.
We installed spark-1.0.2-bin-cdh4 on a 'super machine' with 256 GB of memory
and 48 cores.
We tried to allocate a task with 64 GB of memory, but for whatever reason
Spark is only using around 9 GB at most.
We submitted the Spark job with the following command:
"
/bin/spark-submit --class Sim
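A hedged sketch of how the executor heap is usually requested in Spark 1.x
(this is not the poster's code; the app name is a placeholder). Note that in
local mode everything runs inside the driver JVM, so it is the driver's heap,
typically requested via --driver-memory on spark-submit, that bounds memory:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("sim-placeholder")        // placeholder app name
      .set("spark.executor.memory", "64g")  // executor heap on a real cluster
    val sc = new SparkContext(conf)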