Christian writes really good tools.  Sparkling is no exception.
I have yet to use it in production myself, however, since I haven't had the
need to use Clojure directly to solve any "data aggregation" problems.
Spark and other tools do that well enough, naturally.

As far as using a tool/programming language to solve "data integration"
problems in large enterprise environments, I will ALWAYS use Open Source
tools for that purpose.  Clojure is no exception.  But I do tend to choose
open source hammers to drive nails.  Sometimes Clojure is missing the
handle on its hammer, as we have all experienced, but that's on us since WE
have the power to make Clojure better.  But often TIME is what we lack to
build better APIs, libraries, and tools for Clojure expansion.

The Apache ecosystem offers many tools & libraries for "big data" and "data
integration"  which I often turn to first because I lack TIME for building
(long tail), but have enough TIME for learning new things (shorter tail
that helps the long tail).
https://projects.apache.org/projects.html?category

Thad
https://www.linkedin.com/in/thadguidry/


On Thu, Jul 4, 2019 at 12:37 PM Chris Nuernberger <ch...@techascent.com>
wrote:

> Thad,
>
> Your approach seems very promising to me for a lot of jobs.  Spark runs on
> top of many things.
>
> As far as a clojure layer on top, what do you think about sparkling
> <http://gorillalabs.github.io/sparkling/>?
>
> On Thu, Jul 4, 2019 at 8:43 AM Thad Guidry <thadgui...@gmail.com> wrote:
>
>> "Batch" - doing things in chunks
>> "Processing" - THE WORLD :-)  because it means so many different things
>> to so many folks (including your boss)
>>
>> Without a doubt, you will love Apache Spark for your batch processing and
>> writing Spark Programs to conquer any World you are building.
>> Spend time installing Spark in standalone deploy mode and then use its powerful
>> Spark Shell <https://spark.apache.org/docs/latest/quick-start.html> (the
>> feeling of Clojure REPL  !!)
>> If you just want to jump in to a public cluster and Try Spark, then I
>> would suggest Databricks <https://databricks.com/spark/about>.
>> Spend time reading the features under Libraries drop-down menu on Apache
>> Spark website <https://spark.apache.org/>.
>>
>> You might even be encouraged enough to write an official API in Clojure
>> for Apache Spark within a year!  (win-win)
>>
>> One note of caution: if you are building something for the long term, you
>> will eventually need data versioning, ACID transactions, and schema
>> evolution. For this I use Delta Lake <https://delta.io/> (not Datomic)
>> since it's fully compatible with Spark.
>>
>> Best of luck!
>> Thad
>> https://www.linkedin.com/in/thadguidry/
>>
>>
>> On Thu, Jul 4, 2019 at 3:22 AM orazio <orazio.pist...@gmail.com> wrote:
>>
>>> Hi @atdixon and Thad, thanks for your help.
>>>
>>> Here are more details about my project.
>>> My big data layer is inspired by the Lambda architecture. The pipeline
>>> includes the following layers, with the tool chosen for each:
>>> - *NiFi* for *data ingestion* and publishing data/messages to a Kafka
>>> topic.
>>> - *Kafka* as *message broker*, which, with Kafka Connect, allows me to
>>> store data in MongoDB (MongoDB sink, 1 day retention period) and
>>> HDFS (HDFS sink, 1 year retention period).
>>> - *Real-time processing* with *MongoDB*, using its built-in query engine,
>>> which provides extensive querying, filtering, and searching abilities.
>>> - *Batch processing* of data stored on HDFS, which performs data
>>> aggregation and stores the result in an HBase table. *?* The question is:
>>> which tool do you suggest for processing the data stored on HDFS?
>>> - *Serving layer* with *HBase/Phoenix* to store and allow access to the
>>> batch view.
>>>
>>> Now I'm asking for your help to choose *the most appropriate tool to
>>> execute batch jobs (map reduce)* that will aggregate the data.
>>> Nathan Marz suggests Clojure/Cascalog. Do you know of other excellent
>>> Clojure/Hadoop work in the community for data processing?
>>> If you know of particularly appropriate tools, I could also consider
>>> other work/libraries outside the Clojure community.
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Wednesday, July 3, 2019 at 14:56:09 UTC+2, Thad Guidry wrote:
>>>>
>>>> "The best code is never written"
>>>>
>>>> https://zeppelin.apache.org/
>>>> https://nifi.apache.org/
>>>>
>>>> Thad
>>>> https://www.linkedin.com/in/thadguidry/
>>>>
>>>>
>>>> On Tue, Jul 2, 2019 at 11:07 AM orazio <orazio...@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I'm a newbie to Clojure/Big Data, and I'm starting with Hadoop.
>>>>> I have installed Hortonworks HDP 3.1.
>>>>> I have to design a Big Data layer that ingests large IoT datasets and
>>>>> social media datasets, processes the data with MapReduce jobs, and
>>>>> produces aggregations to store in HBase tables.
>>>>>
>>>>> For now, my focus is on the data processing issue. My question
>>>>> is: is Clojure a good choice for distributed data processing on Hadoop?
>>>>> I found Cascalog, a fully-featured data processing and querying
>>>>> library for Clojure or Java. But are there any active maintainers for
>>>>> this library?
>>>>> Do you know of other excellent Clojure/Hadoop work in the community
>>>>> for data processing?
>>>>>
>>>>> I would appreciate some help.
>>>>>
>>>>> Orazio
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Clojure" group.
>>>>> To post to this group, send email to clo...@googlegroups.com
>>>>> Note that posts from new members are moderated - please be patient
>>>>> with your first post.
>>>>> To unsubscribe from this group, send email to
>>>>> clo...@googlegroups.com
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/clojure?hl=en
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Clojure" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to clo...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/clojure/fbc26ffb-5f00-46a7-bf33-7a899f1ffead%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/clojure/fbc26ffb-5f00-46a7-bf33-7a899f1ffead%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
