Subject: 2GB limit for partitions?

Greetings!

SPARK-1476 says that there is a 2GB limit for blocks. Is this the same as a
2GB limit for partitions (or approximately so)?

What I had been attempting to do is the following:
1) Start with a moderately large data set (currently about 100GB, but growing).
2) Create about 1,000 files

at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:57)

--
*From:* Sean Owen so...@cloudera.com
*To:* Michael Albert m_albert...@yahoo.com
*Cc:* user@spark.apache.org
*Sent:* Monday, February 2, 2015 10:13 PM
*Subject:* Re: 2GB limit for partitions?

The limit is on blocks, not partitions. Partitions have many blocks.
It sounds like you are creating very large values in memory, but I'm
not sure given your description. You will run into problems if a
single object is more than 2GB, of course. More of the stack trace
might show what is mapping

--
To be clear, there is no distinction between partitions and blocks for RDD
caching (each RDD partition corresponds to 1 cache block). The distinction is
important for shuffling, where by definition N partitions are shuffled into M
partitions, creating N*M blocks.
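The arithmetic behind the thread can be sanity-checked with a quick back-of-the-envelope sketch. This is plain Python, not a Spark API; the 2GB block limit and the N*M shuffle-block count are taken from the messages above, and the even-distribution assumption is mine:

```python
# Rough sketch (assumption: data is spread evenly across partitions).
# Each cached RDD partition is one block, and a block is capped at 2GB
# (SPARK-1476), so the partition count bounds the per-block size.
import math

BLOCK_LIMIT = 2 * 1024**3  # 2GB cap on a single block, in bytes

def min_partitions(total_bytes: int, limit: int = BLOCK_LIMIT) -> int:
    """Smallest partition count whose evenly distributed cache blocks
    fit under the limit (one cache block per RDD partition)."""
    return math.ceil(total_bytes / limit)

def shuffle_blocks(n_input: int, m_output: int) -> int:
    """A shuffle from N partitions into M partitions creates N*M blocks."""
    return n_input * m_output

total = 100 * 1024**3            # the ~100GB dataset from the original post
print(min_partitions(total))     # 50 -- and 50 partitions of exactly 2GB
                                 # is borderline, so use a safety margin
print(shuffle_blocks(1000, 200)) # 200000 shuffle blocks
```

In practice this is why the usual workaround is simply to repartition into more, smaller partitions (well above the `min_partitions` floor) so no single block approaches 2GB.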