Try using: org.apache.hadoop.hive.serde2.RegexSerDe
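For example, a minimal sketch of the DDL through HiveContext with the built-in SerDe (the column names and regex here are placeholders, not from the thread):

import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)   // sc: an existing SparkContext
hiveContext.sql("""
  CREATE TABLE logs (host STRING, request STRING)
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
  WITH SERDEPROPERTIES ("input.regex" = "([^ ]*) (.*)")
""")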
GP
On 27 Jul 2015, at 09:35, ZhuGe <t...@outlook.com> wrote:
Hi all:
I am testing the performance of Hive on Spark SQL.
The existing table is created with
ROW FORMAT
SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ...
...@sigmoidanalytics.com> wrote:
Not quite sure, but try pointing spark.history.fs.logDirectory to your S3
Thanks
Best Regards
On Tue, Jun 16, 2015 at 6:26 PM, Gianluca Privitera
<gianluca.privite...@studio.unibo.it> wrote:
On the Spark website, the View After the Fact section
(https://spark.apache.org/docs/latest/monitoring.html) states that you can point the
start-history-server.sh script to a directory in order to view the Web UI using
the logs as data source.
Is it possible to point that script to S3? Maybe
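For what it's worth, a sketch of the setup (the bucket name is a placeholder, and this assumes your S3 credentials are already configured for the Hadoop s3n:// filesystem):

# in conf/spark-defaults.conf (or via SPARK_HISTORY_OPTS as -Dspark.history.* flags)
spark.eventLog.enabled         true
spark.eventLog.dir             s3n://my-bucket/spark-event-logs
spark.history.fs.logDirectory  s3n://my-bucket/spark-event-logs
# then start the server with: ./sbin/start-history-server.sh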
Hi,
I’ve got a problem with Spark Streaming and tshark.
While I’m running locally I have no problems with this code, but when I run it
on an EC2 cluster I get the exception shown just under the code.
def dissection(s: String): Seq[String] = {
try {
Process("hadoop command to create ./lo
You should consider a custom receiver in order to solve the problem of the
“already collected” data.
http://spark.apache.org/docs/latest/streaming-custom-receivers.html
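A minimal sketch of such a receiver (the class name and command string are placeholders): it runs an external process in a background thread and stores each output line as a record.

import java.io.{BufferedReader, InputStreamReader}
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

class ScriptReceiver(command: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    new Thread("Script Receiver") {
      override def run(): Unit = {
        val proc = Runtime.getRuntime.exec(command)
        val reader = new BufferedReader(new InputStreamReader(proc.getInputStream))
        var line = reader.readLine()
        while (line != null && !isStopped()) {
          store(line)                 // hand each output line to Spark Streaming
          line = reader.readLine()
        }
        reader.close()
      }
    }.start()
  }

  def onStop(): Unit = { }            // the reading thread exits once isStopped() is true
}

// usage: val lines = ssc.receiverStream(new ScriptReceiver("tshark ..."))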
Gianluca
On 04 Jul 2014, at 15:46, alessandro finamore
<alessandro.finam...@polito.it> wrote:
Hi,
I have a large
(from the API docs: .../api/java/org/apache/spark/sql/api/java/JavaSchemaRDD.html)
sql(String sqlQuery)
Executes a query expressed in SQL, returning the result as a JavaSchemaRDD
But what kind of sqlQuery can we execute? Is there any more documentation?
Xiaobo Gu
------ Original ------
From
You can find something in the API docs, but nothing more than that for now, I think.
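For example, a quick sketch with the Scala API (the "people" table is hypothetical, registered beforehand); plain SELECT/WHERE style queries over registered tables work:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)   // sc: an existing SparkContext
val teenagers = sqlContext.sql(
  "SELECT name, age FROM people WHERE age >= 13 AND age <= 19")
teenagers.collect().foreach(println)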
Gianluca
On 25 Jun 2014, at 23:36, guxiaobo1982 wrote:
> Hi,
>
> I want to know the full list of functions, syntax, and features that Spark SQL
> supports. Is there some documentation?
>
>
> Regards,
>
> Xiaobo Gu
If you are launching your application with spark-submit, you can manually edit
the spark-class file to make the baseline 1g.
It’s pretty easy to figure out how once you open the file.
This worked for me, even if it’s of course not a final solution.
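The relevant line looks something like this (the exact variable name varies across Spark versions, so treat this as an assumption and check your own copy of bin/spark-class):

# default memory used for the JVM in bin/spark-class
SPARK_MEM=${SPARK_MEM:-512m}   # change 512m to 1g for a larger baseline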
Gianluca
On 12 Jun 2014, at 15:16, e
You can use foreachRDD and then access the RDD data.
Hope this works for you.
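Something along these lines (a sketch; counts is a hypothetical DStream[Long] produced by countByWindow):

counts.foreachRDD { rdd =>
  // each windowed RDD here holds a single count, so collect() is cheap
  rdd.collect().foreach(count => println("events in window: " + count))
}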
Gianluca
On 12 Jun 2014, at 10:06, Wolfinger, Fred
<fwolfin...@cyberpointllc.com> wrote:
Good morning.
I have a question related to Spark Streaming. I have reduced some data down to
a simple count value (by window),
Hi,
I think I may have encountered some kind of bug that at the moment prevents
my application from running correctly on an EC2 cluster.
I say that because the exact same code works wonderfully locally, but behaves
really strangely on the cluster.
val uri = ssc.textFileStream(args(1) +
Is anyone experiencing problems with windows?
dstream1.print()
val dstream2 = dstream1.groupByKeyAndWindow(Seconds(60))
dstream2.print()
In my application the first print() prints out all the strings and
their keys, but after the window function everything is lost and
nothing gets printed.
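For reference, a sketch of a window setup that works (an assumption, not a diagnosis of the problem above): both the window and slide durations must be multiples of the batch interval.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

val conf = new SparkConf().setAppName("window-sketch")
val ssc = new StreamingContext(conf, Seconds(10))        // 10-second batches (assumed)
val lines = ssc.textFileStream("hdfs:///tmp/stream-in")  // placeholder source
val pairs = lines.map(line => (line.split(" ")(0), line))
// 60s window sliding every 10s: both are multiples of the 10s batch interval
val windowed = pairs.groupByKeyAndWindow(Seconds(60), Seconds(10))
windowed.print()
ssc.start()
ssc.awaitTermination()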
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>
On Fri, Jun 6, 2014 at 3:00 AM, Gianluca Privitera
<gianluca.privite...@studio.unibo.it> wrote:
Hi,
I've got a weird question but maybe
Hi,
I think the best thing you could do is launch an instance from that AMI,
add the stuff you want to add, then create a copy of it through the AWS console,
and then launch the ec2 script using the new AMI you just created.
On 06/06/2014 09:20, Akhil Das wrote:
Hi Matt,
You will be needing the following on th
Hi,
I've got a weird question but maybe someone has already dealt with it.
My Spark Streaming application needs to
- download a file from a S3 bucket,
- run a script with the file as input,
- create a DStream from this script output.
I've already got the second part done with the rdd.pipe() API th
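For context, a sketch of that pipe() step (the script name is a placeholder): each element of the RDD is fed to the script's stdin, one per line, and every line the script prints becomes an element of the resulting RDD.

// output: RDD[String], one element per line of the script's stdout
val output = inputRdd.pipe("./process_capture.sh")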
Hi,
if you say you correctly set your access key ID and secret access key,
then it's probably a problem related to the key.pem file.
Try generating a new one, and make sure you are the only one with permission
to read it, or it won't work.
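For reference, the usual way to restrict it (a standard step; the file name is a placeholder):

chmod 400 my-key.pem   # owner read-only; ssh refuses keys readable by others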
Gianluca
On 04/06/2014 09:45, Sam Taylor Steyer wrote:
Hi,
Hi everyone,
I would like to set up a very simple Spark cluster on EC2 (specifically using
only 2 micro instances) and make it run a simple Spark Streaming
application I created.
Has anyone actually managed to do that?
Because after launching the scripts from this page:
http://spark.apache.org/