I can access it directly in China.
> On Nov 13, 2015, at 10:28 AM, Ted Yu wrote:
>
> I was able to access the following, and the response was fast:
>
> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-with-YARN
>
This is the simplest announcement I've seen.
> On Nov 11, 2015, at 12:49 AM, Reynold Xin wrote:
>
> Hi All,
>
> Spark 1.5.2 is a maintenance release containing stability fixes. This release
> is based on the branch-1.5 maintenance branch of Spark. We *strongly
>
> Can you check your JSON input?
>
> Thanks
>
> On Sat, Sep 12, 2015 at 2:05 AM, Fengdong Yu <fengdo...@everstring.com>
> wrote:
>
>> Hi,
>>
>> I am using the Spark 1.4.1 DataFrame API to read JSON data and then save it as ORC.
>> the code is v
> u'key1': u'value1'}, {u'key2': u'value2', 'source':
> u'hdfs://localhost:9000/user/hduser/test/dt=20100102.json'}]
>
> Similarly you could modify the function to return 'source' and 'date' with
> some string manipulation per your requirements.
>
> Let me know if this helps.
>
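For the JSON-to-ORC step in the quoted question, a minimal sketch of one way to do it (assuming Spark 1.4.x with a HiveContext, since the built-in ORC support goes through Hive; the output path below is illustrative, not from the original mail):

// sketch only: read JSON into a DataFrame, then write it back out as ORC
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val df = hiveContext.read.json("hdfs://localhost:9000/user/hduser/test/*.json")
df.write.format("orc").save("hdfs://localhost:9000/user/hduser/test_orc")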
Anchit,
please ignore my inputs. You are right. Thanks.
> On Sep 26, 2015, at 17:27, Fengdong Yu <fengdo...@everstring.com> wrote:
>
> Hi Anchit,
>
> this is not what I expected, because you specified the HDFS directory in your
> code.
> I've solved it like
Hi Anchit,
can you create more than one record in each dataset and test again?
> On Sep 26, 2015, at 18:00, Fengdong Yu <fengdo...@everstring.com> wrote:
>
> Anchit,
>
> please ignore my inputs. You are right. Thanks.
>
>
>
>> On Sep 26, 2015, at 17:27, F
Do you mean you want to publish the artifact to your private repository?
If so, please use 'sbt publish'
and add the following to your build.sbt:
publishTo := {
  val nexus = "https://YOUR_PRIVATE_REPO_HOSTS/"
  if (version.value.endsWith("SNAPSHOT"))
    Some("snapshots" at nexus +
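For reference, a complete publishTo block typically looks like the sketch below; the host and the repository paths are placeholders, not the values from the truncated message above:

// sketch of a full publishTo setting for build.sbt; host and paths are placeholders
publishTo := {
  val nexus = "https://YOUR_PRIVATE_REPO_HOSTS/"
  if (version.value.endsWith("SNAPSHOT"))
    Some("snapshots" at nexus + "content/repositories/snapshots")
  else
    Some("releases" at nexus + "content/repositories/releases")
}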
>
>
> Scala
>
> val rdd = sparkContext.wholeTextFiles("hdfs://a-hdfs-path")
>
> More info:
> https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.SparkContext@wholeTextFiles(String,Int):RDD[(String,String)]
>
> Let us kn
Hi,
I have multiple files in JSON format, such as:
/data/test1_data/sub100/test.data
/data/test2_data/sub200/test.data
I can read them all with sc.textFile("/data/*/*"),
but I want to add {"source": "HDFS_LOCATION"} to each line and then save the result
to one target HDFS location.
How can I do that? Thanks.
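One way to do this, sketched below: sc.wholeTextFiles returns (path, content) pairs, so the source location can be spliced into every line before writing to a single output directory. The string splice and the output path are illustrative assumptions; a JSON library would be safer for real data:

// sketch only: tag each JSON line with the file it came from
val tagged = sc.wholeTextFiles("/data/*/*")
  .flatMap { case (path, content) =>
    content.split("\n").filter(_.trim.nonEmpty).map { line =>
      // naive splice; assumes each line is a JSON object ending with '}'
      line.trim.stripSuffix("}") + s""", "source": "$path"}"""
    }
  }
tagged.saveAsTextFile("/data/tagged_output")   // illustrative target path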
bring clarity to my thoughts?
>
> On Thu, Sep 24, 2015, 23:44 Fengdong Yu <fengdo...@everstring.com
> <mailto:fengdo...@everstring.com>> wrote:
> Hi Anchit,
>
> Thanks for the quick answer.
>
> my exact question is: I want to add the HDFS location into each line in
I don't think there is a performance difference between the 1.x API and the 2.x API,
but it's not a big issue for your change; only
com.databricks.hadoop.mapreduce.lib.input.XmlInputFormat.java
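For context, a minimal sketch of how the two Hadoop API generations are read from Spark; the TextInputFormat and path below are generic placeholders, not the spark-xml XmlInputFormat mentioned above:

// sketch only: the old (mapred) and new (mapreduce) Hadoop APIs use different entry points
import org.apache.hadoop.io.{LongWritable, Text}

// Hadoop 2.x ("new", mapreduce) API
val newApi = sc.newAPIHadoopFile[LongWritable, Text,
  org.apache.hadoop.mapreduce.lib.input.TextInputFormat]("hdfs:///some/path")

// Hadoop 1.x ("old", mapred) API
val oldApi = sc.hadoopFile[LongWritable, Text,
  org.apache.hadoop.mapred.TextInputFormat]("hdfs:///some/path")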
Hi,
I filed an issue, please take a look:
https://issues.apache.org/jira/browse/SPARK-12233
It can definitely be reproduced.
Hi,
I found a very minor typo in:
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
Page 4:
We complement the data mining example in Section 2.2.1 with two iterative
applications: logistic regression and PageRank.
I went back to Section 2.2.1, but those two examples are not there.
hiveContext.read.format("orc").load("mypath/*")
> On Nov 24, 2015, at 1:07 PM, Renu Yadav wrote:
>
> Hi,
>
> I am using DataFrames and want to load ORC files from multiple directories,
> like this:
> hiveContext.read.format.load("mypath/3660,myPath/3661")
>
> but it is not
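A minimal sketch of the glob approach suggested above, plus a union-based alternative; the directory names are illustrative and a Spark 1.5.x HiveContext is assumed:

// option 1: a glob pattern that covers the wanted directories
val df = hiveContext.read.format("orc").load("mypath/36*")

// option 2: read each directory separately and union the results
val parts = Seq("mypath/3660", "mypath/3661")
  .map(p => hiveContext.read.format("orc").load(p))
val all = parts.reduce(_ unionAll _)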