Hello,
I noticed I can run Spark applications with a local master via sbt run
and also via the IDE. I'd like to run a single-threaded worker
application as a self-contained jar.
What does sbt run provide that allows it to run a local master?
Can I build an uber jar and run it without spark-submit?
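A minimal sketch of the setup being asked about (object and app names are hypothetical): if the Spark dependencies are on the runtime classpath rather than marked "provided", and the master is set in code, the assembled jar can be launched with plain `java -jar`, which is roughly what sbt run does using its own resolved classpath.

import org.apache.spark.sql.SparkSession

object LocalApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("local-app")
      .master("local[1]")  // single-threaded local master, as described
      .getOrCreate()
    spark.range(10).show()  // placeholder job
    spark.stop()
  }
}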
…this, particularly without using multiple streams?
On Wed, Dec 26, 2018 at 6:01 PM Colin Williams wrote:
>
> https://stackoverflow.com/questions/53938967/writing-corrupt-data-from-kafka-json-datasource-in-spark-structured-streaming
>
On Wed, Dec 26, 2018 at 2:42 PM Colin Williams wrote:
From my initial impression it looks like I'd need to create my own
`from_json` using `jsonToStructs` as a reference, but try to handle
`case e: BadRecordException => null` or similar, to write the
non-matching string to a corrupt-records column.
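A sketch of a simpler workaround along those lines, assuming the common behavior that `from_json` yields null for unparseable input (schema, broker, and topic names are hypothetical):

import org.apache.spark.sql.functions.{col, from_json, when}
import org.apache.spark.sql.types._

// Hypothetical schema for the Kafka JSON payload.
val schema = new StructType()
  .add("id", StringType)
  .add("ts", TimestampType)

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")  // hypothetical
  .option("subscribe", "events")                     // hypothetical
  .load()

val parsed = df
  .selectExpr("CAST(value AS STRING) AS raw")
  .withColumn("data", from_json(col("raw"), schema))
  // When parsing fails, keep the raw string in a corrupt-records
  // column; otherwise leave it null.
  .withColumn("corrupt_record", when(col("data").isNull, col("raw")))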
On Wed, Dec 26, 2018 at 1:55 PM Colin Williams wrote:
Hi,
I'm trying to figure out how I can write records that don't match a
JSON read schema via Spark structured streaming to an output sink /
parquet location. Previously I did this in batch using the
corrupt-record column feature of the batch reader. But in Spark
structured streaming I'm reading a string from Kafka
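For context, a sketch of the batch-mode corrupt-record handling referred to above; the `_corrupt_record` name is the usual default, but treat the schema and paths as assumptions:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

// Declare the expected fields plus a column to receive the raw text
// of records that fail to parse (hypothetical schema).
val schema = new StructType()
  .add("id", StringType)
  .add("ts", TimestampType)
  .add("_corrupt_record", StringType)

val batch = spark.read
  .schema(schema)
  .option("mode", "PERMISSIVE")
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .json("/data/input/")  // hypothetical path
  .cache()  // recent Spark versions require caching before filtering
            // on the internal corrupt-record column

batch.filter(col("_corrupt_record").isNotNull)
  .write.parquet("/data/corrupt/")  // hypothetical sink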
>
> Best,
> Anastasios
>
> On Mon, Dec 24, 2018 at 10:29 PM Colin Williams wrote:
I've been trying to read from Kafka via a Spark streaming client. I
found out the Spark cluster doesn't have certificates deployed. Then I
tried using the same local certificates I've been testing with, by
packing them in an uber jar and getting a File handle from the
ClassLoader resource. But I'm
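A sketch of the approach described, assuming the usual constraint that Kafka's SSL options want a filesystem path rather than a classpath resource (resource name, broker, and topic are hypothetical):

import java.nio.file.{Files, StandardCopyOption}

// Copy a truststore bundled in the uber jar out to a temp file, since
// kafka.ssl.truststore.location expects a local path, not a resource.
def truststorePath(resource: String): String = {
  val in = getClass.getResourceAsStream(resource)
  require(in != null, s"resource $resource not found on classpath")
  val tmp = Files.createTempFile("kafka", ".truststore.jks")
  tmp.toFile.deleteOnExit()
  Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING)
  tmp.toString
}

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9093")  // hypothetical
  .option("kafka.security.protocol", "SSL")
  .option("kafka.ssl.truststore.location",
    truststorePath("/client.truststore.jks"))        // hypothetical
  .option("subscribe", "events")                     // hypothetical
  .load()

// Caveat: this materializes the file on the driver; on a real cluster
// each executor opening a Kafka consumer also needs the file at the
// configured path, which seems to be part of what the message runs into.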
Looks like it's been reported already. It's too bad it's been a year,
but it should be released in Spark 3:
https://issues.apache.org/jira/browse/SPARK-22231
On Fri, Nov 23, 2018 at 8:42 AM Colin Williams wrote:
Seems like it's worthy of filing a bug against withColumn
On Wed, Nov 21, 2018, 6:25 PM Colin Williams <colin.williams.seat...@gmail.com> wrote:
Hello,
I'm currently trying to update the schema for a DataFrame with nested
columns. I would either like to update the schema itself, or cast the
column without having to explicitly select all the columns just to
cast one.
With regard to updating the schema, it looks like I would probably
need to
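One possibility, sketched under the assumption of a single nested struct column (column and field names hypothetical): cast the whole struct to a copy of its type with just the one field changed, which avoids re-selecting every column.

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

// Rebuild the struct's type with one field retyped, then cast in place.
val oldType = df.schema("event").dataType.asInstanceOf[StructType]
val newType = StructType(oldType.fields.map {
  case StructField("ts", StringType, nullable, meta) =>
    StructField("ts", TimestampType, nullable, meta)
  case other => other
})

val fixed = df.withColumn("event", col("event").cast(newType))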
Does anybody know how to use inferred schemas with structured
streaming?
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#schema-inference-and-partition-of-streaming-dataframesdatasets
I have some code like:
object StreamingApp {
def launch(config: Config,
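Per the linked guide section, file-based streaming sources need spark.sql.streaming.schemaInference enabled; a minimal sketch (session setup and path are hypothetical):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("StreamingApp")
  // Off by default: lets file stream sources infer the schema from
  // files already present at the path.
  .config("spark.sql.streaming.schemaInference", "true")
  .getOrCreate()

val events = spark.readStream.json("/data/incoming/")  // hypothetical path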
I'm confused as to why Spark's DataFrame reader does not support
reading JSON or similar with microsecond timestamps into microseconds,
but instead reads them into millis.
This seems strange when TimestampType supports microseconds.
For example, create a schema for a JSON object with a column of
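A sketch of the scenario being set up, since the example is cut off above (field name, format, and path are hypothetical); the complaint is that the sub-millisecond digits get lost on the way into TimestampType:

import org.apache.spark.sql.types._

// Input like: {"event_time": "2018-12-26T18:01:00.123456"}
val schema = new StructType().add("event_time", TimestampType)

val df = spark.read
  .schema(schema)
  .option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSSSSS")
  .json("/data/events.json")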
…https://stackoverflow.com/a/25204589, but it's from an
older version of Spark.
I'm hoping maybe there is something more recent and more in-depth. I
don't mind references to books or otherwise.
Best,
Colin Williams
.load("src/test/resources/*.gz")
df1.show(80)
On Wed, Mar 28, 2018 at 5:10 PM, Colin Williams <colin.williams.seat...@gmail.com> wrote:
> I've had more success exporting the schema to JSON and importing that.
> Something like:
>
> val df1: DataFrame = session.r
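A sketch of that round trip, assuming the standard DataType.fromJson parser; the glob path comes from the snippet above, and the variable names are guesses at the elided code:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{DataType, StructType}

// Export the schema as JSON text, then rebuild it for a later read.
val schemaJson: String = df1.schema.json

val restored = DataType.fromJson(schemaJson).asInstanceOf[StructType]

val df2: DataFrame = session.read
  .schema(restored)
  .json("src/test/resources/*.gz")
df2.show(80)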
On Wed, Mar 28, 2018 at 3:25 PM, Colin Williams <colin.williams.seat...@gmail.com> wrote:
> The toString representation looks like the following, where "someName" is unique:
>
> StructType(StructField("someName",StringType,true),
> Str
SOME_TABLE_NAME:struct<newValue:string,SOME_TABLE_NAME:string>,
SOME_TABLE_NAME:struct<newValue:string,SOME_TABLE_NAME:string>,
SOME_TABLE_NAME:struct<newValue:string,SOME_TABLE_NAME:string>,
SOME_TABLE_NAME:struct<newValue:string,SOME_TABLE_NAME:string>,
SOME_TABLE_NAME:struct<newValue:s
I've been learning Spark SQL and have been trying to export and import
some of the generated schemas so I can edit them. I've been writing the
schemas to strings like df1.schema.toString() and
df.schema.catalogString.
But I've been having trouble loading the schemas created. Does anyone
know if it's
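Worth noting as an aside: toString() and catalogString aren't designed to be parsed back, but on reasonably recent Spark (fromDDL appeared around 2.3, toDDL around 2.4) a DDL string round-trips:

import org.apache.spark.sql.types.StructType

// DDL text like "`id` STRING, `ts` TIMESTAMP" can be edited by hand
// and parsed back, unlike the toString()/catalogString forms.
val ddl = df1.schema.toDDL
val restored = StructType.fromDDL(ddl)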