Hello,
Using Spark 2.2.0. Interested in seeing dynamic topic subscription in
action.
Tried this example: streaming.DirectKafkaWordCount (which uses
org.apache.spark.streaming.kafka010)
I started with 8 Kafka partitions in my topic and found that Spark Streaming
executes 8 tasks (one per partition).
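For reference, a minimal sketch of dynamic subscription with the 0.10
integration, using SubscribePattern so that topics matching a regex are picked
up as they appear. The broker address, group id, and topic pattern below are
placeholders:

import java.util.regex.Pattern

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.SubscribePattern

object DynamicSubscriptionSketch {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("DynamicSubscriptionSketch"), Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",          // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "dynamic-subscription-demo",        // placeholder group id
      "auto.offset.reset" -> "latest")

    // SubscribePattern is what gives the dynamic behavior: new topics matching
    // the regex are subscribed to as they are created.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent,
      SubscribePattern[String, String](Pattern.compile("events-.*"), kafkaParams))

    // One task per Kafka partition per batch, matching what you observed.
    stream.map(_.value).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}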
Hi all,
I am trying to run an image-analytics type of workload using Spark. The
images are read in JPEG format and then converted to raw format in map
functions, and this causes the size of the partitions to grow by an
order of magnitude. In addition to this, I am caching some of the data because my
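If the blow-up after decoding is what hurts, one knob worth trying is a
serialized (and disk-spilling) storage level when caching. A rough sketch,
where decodeToRaw stands in for the real JPEG-to-raw conversion (both it and
the input path are made up):

import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

val spark = SparkSession.builder().appName("ImagePipelineSketch").getOrCreate()
val sc = spark.sparkContext

// Placeholder for the real JPEG -> raw-pixel conversion that inflates partitions.
def decodeToRaw(bytes: Array[Byte]): Array[Byte] = bytes

val raw = sc.binaryFiles("hdfs:///images/*.jpg")          // made-up input path
  .map { case (_, stream) => decodeToRaw(stream.toArray()) }

// MEMORY_AND_DISK_SER stores the inflated raw data serialized and spills to
// disk, trading CPU on access for a much smaller footprint than MEMORY_ONLY.
raw.persist(StorageLevel.MEMORY_AND_DISK_SER)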
Hi All,
There are several categorical columns in my dataset as follows:
[image: Inline images 1]
How can I transform the values in each categorical column into numerics using
StringIndexer so that the resulting DataFrame can be fed into
VectorAssembler to generate a feature vector?
A naive approach
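One approach that should work: build one StringIndexer per categorical column
and chain them with a VectorAssembler in a Pipeline. Untested sketch; the
column names and df are made up:

import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{StringIndexer, VectorAssembler}

// Hypothetical categorical column names; substitute your own.
val categoricalCols = Array("color", "size", "brand")

// One StringIndexer per column, each writing <col>_idx.
val indexers = categoricalCols.map { c =>
  new StringIndexer().setInputCol(c).setOutputCol(c + "_idx")
}

val assembler = new VectorAssembler()
  .setInputCols(categoricalCols.map(_ + "_idx"))
  .setOutputCol("features")

val stages: Array[PipelineStage] = indexers ++ Array(assembler)
val model = new Pipeline().setStages(stages).fit(df)   // df: your input DataFrame
val withFeatures = model.transform(df)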
Yes, I checked both the output location and the console. Neither has any
data.
The link also has the code and the question that I have raised with Azure
HDInsight.
https://github.com/Azure/spark-eventhubs/issues/195
On Fri, Oct 27, 2017 at 3:22 PM, Shixiong(Ryan) Zhu wrote:
> The code in the link
May I ask what the use case is? Although it is a very interesting question,
I would be concerned about going further than a proof of concept. A lot of the
enterprises I see and visit are barely on Java 8, so starting to talk about JDK 9
might be slight overkill, but if you have a good story, I’m a
The code in the link writes the data into files. Did you check the output
location?
By the way, if you want to see the data on the console, you can use the
console sink by changing this line *format("parquet").option("path",
outputPath + "/ETL").partitionBy("creationTime").start()* to
*format("console").start()*
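In other words, something like this, where etlDf stands in for the streaming
DataFrame in your code:

val query = etlDf.writeStream
  .format("console")        // print each micro-batch to stdout instead of parquet
  .outputMode("append")
  .start()

query.awaitTermination()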
Certainly, Scala 2.12 support precedes Java 9 support. A lot of the work is
in place already, and the last issue is dealing with how Scala closures are
now implemented quite differently, with lambdas / invokedynamic. This affects
the ClosureCleaner. For those interested, this is, as far as I know, the main
Scala 2.12 is not yet supported in Spark, which also means no JDK 9:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-14220
If you look at Oracle's support plans, JDK 9 is in any case only supported for 6
months. JDK 8 is LTS (5 years), JDK 18.3 will get only 6 months, and JDK 18.9 is
LTS
Hi Tathagata Das,
I was trying to use Event Hubs with Spark streaming. It looks like I was able
to make the connection successfully, but I cannot see any data on the console.
I am not sure whether Event Hubs is supported or not.
https://github.com/Azure/spark-eventhubs/blob/master/examples/src/main/scala/com/microsoft/sp
I was looking at this example but didn't get any output from it when I used it.
https://github.com/Azure/spark-eventhubs/blob/master/examples/src/main/scala/com/microsoft/spark/sql/examples/EventHubsStructuredStreamingExample.scala
On Fri, Oct 27, 2017 at 9:18 AM, ayan guha wrote:
> Does event hub
Does Event Hubs support structured streaming at all yet?
On Fri, 27 Oct 2017 at 1:43 pm, KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:
> Hi,
>
> Could anyone share a code snippet showing how to use Spark
> structured streaming with Event Hubs?
>
> Thanks,
> Asmath
>
What you have is sequential code, and hence sequential processing.
Also, Spark/Scala are not parallel programming languages.
But even if they were, statements are executed sequentially unless you exploit
the parallel/concurrent execution features.
Anyway, see if this works:
val (RDD1, RDD2) = (JavaFunc
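For example, with Scala Futures. A minimal sketch, where rdd1 and rdd2 stand
in for two independent RDDs:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Actions submitted from different threads become separate Spark jobs and can
// run concurrently, given enough executor cores.
val f1 = Future { rdd1.count() }
val f2 = Future { rdd2.count() }

val n1 = Await.result(f1, Duration.Inf)
val n2 = Await.result(f2, Duration.Inf)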
I found a workaround: when I create the Hive table using Spark's saveAsTable, I
see filters being pushed down.
The other approaches I tried, where filters are not pushed down, are:
1) when I create the Hive table upfront and load ORC into it using Spark SQL
2) when I create ORC files using Spark SQL and t
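For anyone who wants to reproduce this, a sketch of the working path; the
table and column names are invented, and explain(true) shows whether
PushedFilters appears in the physical plan:

import spark.implicits._

// May be off by default depending on the Spark 2.x version; worth setting
// explicitly when testing pushdown.
spark.conf.set("spark.sql.orc.filterPushdown", "true")

df.write.format("orc").saveAsTable("events_orc")       // invented table name

val q = spark.table("events_orc").filter($"event_date" === "2017-10-27")
q.explain(true)   // check the physical plan for "PushedFilters: [...]"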
I have a Spark job to compute the similarity between text documents:

// needs org.apache.spark.mllib.linalg.distributed.{RowMatrix, CoordinateMatrix, MatrixEntry}
RowMatrix rowMatrix = new RowMatrix(vectorsRDD.rdd());
// DIMSUM sampling with threshold 0.5; returns similarities between columns
CoordinateMatrix colSimilarity = rowMatrix.columnSimilarities(0.5);
JavaRDD<MatrixEntry> entries = colSimilarity.entries().toJavaRDD();
// collect() pulls every similarity entry to the driver
List<MatrixEntry> list = entries.collect();
for (MatrixEntry s : list)