Thanks Manjunath, please take a look at line 64
https://github.com/apache/incubator-spot/blob/master/spot-ml/src/main/scala/org/apache/spot/proxy/ProxySuspiciousConnectsAnalysis.scala
I’m trying to get sample data but no luck for now. I will let you know if I get
some.
Thanks.
From:
Can you post your code and sample input?
That should help us understand whether the bug is in the code as written or in
the platform.
Regards,
Kiran
From: "Barona, Ricardo"
Date: Friday, June 9, 2017 at 10:47 PM
To: "user@spark.apache.org"
In Spark 1.6.0 I'm having an issue with saveAsText and write.mode.text: I
have a data frame with 1M+ rows and then I do:
dataFrame.limit(500).map(_.mkString("\t")).toDF("row").write.mode(SaveMode.Overwrite).text("myHDFSFolder/results")
Then when I check the results file, I see 900+
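For reference, the intent of that pipeline (keep at most 500 rows, render each row as a tab-separated line) can be sketched with plain Scala collections; the row contents below are made up, and this deliberately leaves Spark out of the picture:

```scala
// Plain-Scala sketch (no Spark) of what the snippet is meant to do.
// Fabricated rows standing in for the 1M+ row data frame.
val rows: Seq[Seq[Any]] =
  (1 to 10000).map(i => Seq(i, s"host$i", s"value$i"))

val limited: Seq[String] =
  rows.take(500)              // analogous to dataFrame.limit(500)
      .map(_.mkString("\t"))  // analogous to map(_.mkString("\t"))

// With local collections the count is exactly 500; seeing 900+ rows
// after the Spark write is what makes the reported behavior look wrong.
println(limited.size)
```

If the report holds up, the discrepancy would sit in how Spark 1.6 executes limit before the write, not in mkString itself.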
Thanks a lot for your quick help!
Further, I have 2 more points:
a) I heard from my colleagues that if my Scala code uses RDDs then I need to
replace them with Datasets/DataFrames. Why is that?
b) One of the operators, saveAsTextFile, is taking a long time. What would be
the probable cause and
Also, read Holden's newest book, High Performance Spark:
http://shop.oreilly.com/product/0636920046967.do
On Fri, Jun 9, 2017 at 5:38 PM, Alonso Isidoro Roman
wrote:
a quick search on google:
https://www.cloudera.com/documentation/enterprise/5-9-x/topics/admin_spark_tuning.html
https://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
and of course,
The bug is related to long checkpoints being truncated when dealing with
topics that have a large number of partitions; in my case, 120.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/StructuredStreaming-StreamingQueryException-tp28749p28754.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
This is a bug in Spark 2.1.0; it seems to be fixed in Spark 2.1.1 when I
ran with that version.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/StructuredStreaming-StreamingQueryException-tp28749p28753.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi,
def tallSkinnyQR(computeQ: Boolean = false): QRDecomposition[RowMatrix, Matrix]
In the output of this method, Q is a distributed matrix and R is a local
Matrix.
What's the reason R is a local Matrix?
-Arun
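One likely reason (my reading, not an official statement): for an m x k tall-skinny matrix, Q has the same m x k shape as the input, so it stays distributed, while R is only k x k regardless of how many rows there are, so it is cheap to hold as a local Matrix on the driver. A plain-Scala classical Gram-Schmidt sketch makes the shapes visible (illustration only; Spark's tallSkinnyQR uses a different, distributed algorithm):

```scala
// QR of an m x k matrix via classical Gram-Schmidt (assumes full column rank).
// The point is the shapes: Q is m x k (as big as the data), R is k x k (tiny).
def qr(a: Array[Array[Double]]): (Array[Array[Double]], Array[Array[Double]]) = {
  val m = a.length
  val k = a(0).length
  val q = Array.ofDim[Double](m, k) // same shape as the input matrix
  val r = Array.ofDim[Double](k, k) // independent of the row count m
  for (j <- 0 until k) {
    val v = Array.tabulate(m)(i => a(i)(j))
    for (p <- 0 until j) {
      // r(p)(j) = q_p . a_j, then subtract the projection from column j
      r(p)(j) = (0 until m).map(i => q(i)(p) * a(i)(j)).sum
      for (i <- 0 until m) v(i) -= r(p)(j) * q(i)(p)
    }
    r(j)(j) = math.sqrt(v.map(x => x * x).sum)
    for (i <- 0 until m) q(i)(j) = v(i) / r(j)(j)
  }
  (q, r)
}
```

For a 1,000,000 x 10 input, R is just a 10 x 10 array, which is why returning it as a local Matrix costs almost nothing.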
Hi,
I need some help/guidance in performance tuning Spark code written in Scala.
Can you please help?
Thanks
Hi Takeshi,
Thank you very much.
Regards,
Chanh
On Thu, Jun 8, 2017 at 11:05 PM Takeshi Yamamuro
wrote:
> I filed a jira about this issue:
> https://issues.apache.org/jira/browse/SPARK-21024
>
> On Thu, Jun 8, 2017 at 1:27 AM, Chanh Le wrote:
Hello, Ranadip!
I tried your solution, but still have no results. Also, I didn't find
anything in the logs.
Kerberos is disabled, and dfs.permissions = false.
Thanks.
2017-06-08 20:52 GMT+03:00 Ranadip Chatterjee:
> Looks like your session user does not have the required privileges on
Thank you for your response.
Yes, I tried this solution, and it works fine, but that solution is for a
collocated Hive cluster.
I need to query more than one remote cluster in one Spark session, and
because of this I need to use a connection over JDBC.
Maybe you know how to query more than one remote server
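Since a HiveContext binds to a single metastore, the JDBC route you mention is one workaround: open a plain JDBC connection per remote HiveServer2. A hedged sketch with java.sql (the hostnames are placeholders I made up, the Hive JDBC driver would have to be on the classpath, and this bypasses Spark's distributed execution for those queries):

```scala
import java.sql.{Connection, DriverManager, ResultSet}

// Placeholder HiveServer2 endpoints for two remote clusters (assumed names).
val clusterUrls: Map[String, String] = Map(
  "clusterA" -> "jdbc:hive2://hive-a.example.com:10000/default",
  "clusterB" -> "jdbc:hive2://hive-b.example.com:10000/default"
)

// Run the same SQL against every cluster, one connection per endpoint.
// Not invoked here: it needs live servers and the Hive JDBC driver.
def queryAll(sql: String)(handle: ResultSet => Unit): Unit =
  for ((name, url) <- clusterUrls) {
    val conn: Connection = DriverManager.getConnection(url)
    try {
      val rs = conn.createStatement().executeQuery(sql)
      while (rs.next()) handle(rs)
    } finally conn.close()
  }
```

The results land on the driver, so this fits small lookup queries; for large remote tables you would still want a Spark-native read per cluster.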