I will use a portion of the data and try. Will the HDFS block affect
Spark? (If so, it's hard to reproduce.)
On Wed, Dec 30, 2015 at 3:22 AM, Joseph Bradley wrote:
> Hi Li,
>
> I'm wondering if you're running into the same bug reported here:
> https://issues.apache.org/jira/browse/SPARK-12488
>
> I hav
Hi fellas,
I am new to Spark and I have a newbie question. I am currently reading the
source code of the Spark SQL Catalyst analyzer. I don't quite understand the
partial function in PullOutNondeterministic. What does "pull out" mean? Why do
we have to do the "pulling out"?
I would really appre
OK to close the loop - this thread has nothing to do with Spark?
On Tue, Dec 29, 2015 at 9:55 PM, Ted Yu wrote:
> Oops, wrong list :-)
>
> On Dec 29, 2015, at 9:48 PM, Reynold Xin wrote:
>
> +Herman
>
> Is this coming from the newly merged Hive parser?
>
>
>
> On Tue, Dec 29, 2015 at 9:46 PM,
Oops, wrong list :-)
> On Dec 29, 2015, at 9:48 PM, Reynold Xin wrote:
>
> +Herman
>
> Is this coming from the newly merged Hive parser?
>
>
>
>> On Tue, Dec 29, 2015 at 9:46 PM, Allen Zhang wrote:
>>
>>
>> format issue I think, go ahead
>>
>>
>>
>>
>> At 2015-12-30 13:36:05, "Ted Yu"
+Herman
Is this coming from the newly merged Hive parser?
On Tue, Dec 29, 2015 at 9:46 PM, Allen Zhang wrote:
>
>
> format issue I think, go ahead
>
>
>
>
> At 2015-12-30 13:36:05, "Ted Yu" wrote:
>
> Hi,
> I noticed that there are a lot of checkstyle warnings in the following
> form:
>
> s
format issue I think, go ahead
At 2015-12-30 13:36:05, "Ted Yu" wrote:
Hi,
I noticed that there are a lot of checkstyle warnings in the following form:
To my knowledge, we use two spaces for each tab. Not sure why all of a sudden
we have so many IndentationCheck warnings:
grep 'hil
Hi,
I noticed that there are a lot of checkstyle warnings in the following form:
To my knowledge, we use two spaces for each tab. Not sure why all of a
sudden we have so many IndentationCheck warnings:
grep 'hild have incorrect indentati' trunkCheckstyle.xml | wc
3133 52645 678294
If th
Hi Jan, could you post your code? I could not reproduce this issue in my
environment.
Best Regards,
Shixiong Zhu
2015-12-29 10:22 GMT-08:00 Shixiong Zhu :
> Could you create a JIRA? We can continue the discussion there. Thanks!
>
> Best Regards,
> Shixiong Zhu
>
> 2015-12-29 3:42 GMT-08:00 Jan
If you use hadoopFile (or textFile) and have the same file on the same path
in every node, I suspect it might just work.
On Tue, Dec 29, 2015 at 3:57 AM, Disha Shrivastava
wrote:
> Hi,
>
> Suppose I have a file locally on my master machine and the same file is
> also present in the same path on
An RDD is a collection of objects, and if those objects are mutable and are
changed, the same change will be reflected in the RDD.
For immutable objects it will not. Changing mutable objects that are in
the RDD is not good practice.
The RDD is immutable in the sense that any transformation on the RDD will
result in a new RDD.
You can, but you shouldn't. Using backdoors to mutate the data in an RDD
is a good way to produce confusing and inconsistent results when, e.g., an
RDD's lineage needs to be recomputed or a Task is resubmitted on fetch
failure.
On Tue, Dec 29, 2015 at 11:24 AM, ai he wrote:
> Same thing.
>
> Sa
Same thing.
Say your underlying structure is Array(ArrayBuffer(1, 2),
ArrayBuffer(3, 4)).
Then you can add/remove data in the ArrayBuffers, and the change will
be reflected in the RDD.
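The structure described above can be sketched in plain Scala, without Spark (the object and variable names here are mine, for illustration only):

```scala
import scala.collection.mutable.ArrayBuffer

object MutableInsideImmutable {
  def main(args: Array[String]): Unit = {
    // The outer Array reference never changes, but the ArrayBuffers it
    // holds are mutable objects.
    val data = Array(ArrayBuffer(1, 2), ArrayBuffer(3, 4))

    // In-place mutation of the buffers is visible through the outer Array;
    // an RDD parallelized over `data` would be holding these same objects.
    data(0) += 5
    data(1).remove(0)

    println(data.map(_.mkString(",")).mkString(" | "))  // prints "1,2,5 | 4"
  }
}
```

This is exactly why mutating objects inside an RDD is discouraged: the RDD's lineage cannot know the contents changed underneath it.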
On Tue, Dec 29, 2015 at 11:19 AM, salexln wrote:
> I see, so in order the RDD to be completely immutab
Hi Li,
I'm wondering if you're running into the same bug reported here:
https://issues.apache.org/jira/browse/SPARK-12488
I haven't figured out yet what is causing it. Do you have a small corpus
which reproduces this error, and which you can share on the JIRA? If so,
that would help a lot in de
I see, so in order for the RDD to be completely immutable, its contents should
be immutable as well.
And if the contents are not immutable, we can change them, but cannot
add / remove data?
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-Vector-I
Hi salexln,
RDD's immutability depends on the underlying structure. I have the
following example.
--
scala> val m = Array.fill(2, 2)(0)
m: Array[Array[Int]] = Array(Array(0, 0), Array(0, 0))
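The example above is cut off, but presumably it goes on to mutate `m` in place. A minimal sketch of that idea in plain Scala (no Spark needed; the object name is mine):

```scala
object ArrayMutationSketch {
  def main(args: Array[String]): Unit = {
    // Array.fill(2, 2)(0) builds a 2x2 array of zeros.
    val m = Array.fill(2, 2)(0)

    // `m` is a val, but Array elements can still be updated in place,
    // so anything holding a reference to `m` (such as an RDD built
    // from it) observes the change.
    m(0)(0) = 7

    println(m.map(_.mkString(",")).mkString(";"))  // prints "7,0;0,0"
  }
}
```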
bq. had configured the pull request builder timeout to be 300 minutes (5
hours)
From
https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=spark-test/4592/
:
- 1 hr 55 min building on an executor;
- 1 hr 55 min total from scheduled to comple
Hi Josh,
Your HiveThriftBinaryServerSuite fix wasn't in the build I was running (I
forgot to merge the latest master). So it might actually work.
As for stopping the build, it is understandable that you cannot do that
without the proper permissions. It would still be cool to be able to issue
a 's
Could you create a JIRA? We can continue the discussion there. Thanks!
Best Regards,
Shixiong Zhu
2015-12-29 3:42 GMT-08:00 Jan Uyttenhove :
> Hi guys,
>
> I upgraded to the RC4 of Spark (streaming) 1.6.0 to (re)test the new
> mapWithState API, after previously reporting issue SPARK-11932 (
> ht
Yeah, I thought that my quick fix might address the
HiveThriftBinaryServerSuite hanging issue, but it looks like it didn't work
so I'll now have to do the more principled fix of using a UDF which sleeps
for some amount of time.
In order to stop builds, you need to have a Jenkins account with the p
Thanks. I'll merge the most recent master...
Still curious if we can stop a build.
Kind regards,
Herman van Hövell tot Westerflier
2015-12-29 18:59 GMT+01:00 Ted Yu :
> HiveThriftBinaryServerSuite got stuck.
>
> I thought Josh has fixed this issue:
>
> [SPARK-11823][SQL] Fix flaky JDBC cancell
HiveThriftBinaryServerSuite got stuck.
I thought Josh has fixed this issue:
[SPARK-11823][SQL] Fix flaky JDBC cancellation test in
HiveThriftBinaryServerSuite
On Tue, Dec 29, 2015 at 9:56 AM, Herman van Hövell tot Westerflier <
hvanhov...@questtec.nl> wrote:
> My AMPLAB jenkins build has been s
My AMPLAB jenkins build has been stuck for a few hours now:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48414/consoleFull
Is there a way for me to stop the build?
Kind regards,
Herman van Hövell
Hi Alexander,
Thanks a lot for your response. Yes, I am considering the use case where the
weight matrix is too large to fit into the main memory of a single machine.
Can you tell me ways of dividing the weight matrix? According to my
investigations so far, we can do this in two ways:
1. By parall
Hi,
Suppose I have a file locally on my master machine and the same file is
also present in the same path on all the worker machines , say
/home/user_name/Desktop. I wanted to know that when we partition the data
using sc.parallelize , Spark actually broadcasts parts of the RDD to all
the worker m
Hi guys,
I upgraded to the RC4 of Spark (streaming) 1.6.0 to (re)test the new
mapWithState API, after previously reporting issue SPARK-11932 (
https://issues.apache.org/jira/browse/SPARK-11932).
My Spark streaming job involves reading data from a Kafka topic (using
KafkaUtils.createDirectStream),