Thanks Julian. See some questions in-line:

On 2/6/19, 3:01 PM, "Julian Jaffe" <jja...@pinterest.com.INVALID> wrote:
    I think this question is going the other way (e.g. how to read data into
    Spark, as opposed to into Druid). For that, the quickest and dirtiest
    approach is probably to use Spark's JSON support to parse a Druid response.

[Rajiv] Can you please expand more here?

    You may also be able to repurpose some code from
    https://github.com/SparklineData/spark-druid-olap, but I don't think
    there's any official guidance on this.

    On Wed, Feb 6, 2019 at 2:21 PM Gian Merlino <g...@apache.org> wrote:

    > Hey Rajiv,
    >
    > There's an unofficial Druid/Spark adapter at:
    > https://github.com/metamx/druid-spark-batch. If you want to stick to
    > official things, then the best approach would be to use Spark to write
    > data to HDFS or S3 and then ingest it into Druid using Druid's
    > Hadoop-based or native batch ingestion. (Or even write it to Kafka using
    > Spark Streaming and ingest from Kafka into Druid using Druid's Kafka
    > indexing service.)
    >
    > On Wed, Feb 6, 2019 at 12:04 PM Rajiv Mordani <rmord...@vmware.com.invalid>
    > wrote:
    >
    > > Is there a best practice for how to load data from druid to use in a
    > > spark batch job? I asked this question on the user alias but got no
    > > response, hence reposting here.
    > >
    > > * Rajiv
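[Expanding on the "parse a Druid response with Spark's JSON support" idea, a minimal sketch: Druid's SQL HTTP endpoint (`/druid/v2/sql` on the broker, default port 8082) returns a JSON array of row objects when queried with `resultFormat: "object"`, which Spark can ingest directly via `spark.read.json`. The sample payload, column names, and broker address below are illustrative assumptions, not output from a real cluster; the Spark calls are shown in comments so the snippet runs with only the standard library:]

```python
import json

# Illustrative Druid SQL response body (not from a real cluster). Druid
# returns a JSON array of row objects for
#   POST /druid/v2/sql  {"query": "SELECT ...", "resultFormat": "object"}
sample_response = '''[
  {"__time": "2019-02-06T00:00:00.000Z", "page": "Main", "edits": 33},
  {"__time": "2019-02-06T01:00:00.000Z", "page": "Talk", "edits": 12}
]'''

# Parse the response into plain Python dicts.
rows = json.loads(sample_response)

# Inside a Spark job, you could hand the raw rows to Spark and let it infer
# the schema, e.g. (assumes an existing SparkSession named `spark`):
#   rdd = spark.sparkContext.parallelize([json.dumps(r) for r in rows])
#   df = spark.read.json(rdd)
# Fetching the response itself could use any HTTP client, e.g.:
#   requests.post("http://<broker-host>:8082/druid/v2/sql",
#                 json={"query": "SELECT ...", "resultFormat": "object"})

print(len(rows), rows[0]["page"])
```

This works for modest result sets pulled through the broker; for bulk transfers, Gian's suggestion below-writing from Spark to deep storage (HDFS/S3) and letting Druid batch-ingest it, or going through Kafka-scales better than querying Druid row-by-row.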