hi,
We are building an analytics dashboard. Data will be updated every 5
minutes for now and eventually every minute, maybe more frequently. The
amount of incoming data is not huge: maybe 30 records per minute per
customer, although we could have 500 customers. Is streaming correct for this
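For a rough sense of scale, the figures in the message above can be turned into an aggregate ingest rate. This is only a back-of-envelope sketch: it assumes all 500 customers send at the quoted 30 records per minute at the same time, and the names used (`records_per_batch` and the two constants) are made up for illustration.

```python
# Back-of-envelope ingest rate from the figures quoted in the message.
# Assumption: every customer sends at the quoted rate simultaneously.
RECORDS_PER_CUSTOMER_PER_MINUTE = 30  # "maybe 30 records per minute"
CUSTOMERS = 500                       # "we could have 500 customers"


def records_per_batch(batch_minutes: float) -> int:
    """Records arriving across all customers during one update interval."""
    return int(RECORDS_PER_CUSTOMER_PER_MINUTE * CUSTOMERS * batch_minutes)


records_per_minute = records_per_batch(1)
records_per_second = records_per_minute / 60

print("per minute:", records_per_minute)
print("per second:", records_per_second)
print("per 5-minute batch:", records_per_batch(5))
```

Under these assumptions the whole system sees a few hundred records per second, which is the number to keep in mind when weighing periodic batch loads against streaming.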
Instead
at 4:34 PM, Gordon Benjamin
gordon.benjami...@gmail.com
wrote:
hi,
We are building an analytics dashboard. Data will be updated every 5
minutes for now and eventually every 1 minute, maybe more frequently. The
amount of data coming
will be updated with the new
data. And yes, the end user won't notice anything while you do the
coalesce/repartition and so on, but after that your dashboards will be
refreshed with new data.
Thanks
Best Regards
On Mon, Nov 24, 2014 at 4:54 PM, Gordon Benjamin
gordon.benjami...@gmail.com
Hi,
We are seeing bad performance as we incrementally load data. Here is the
config
Spark standalone cluster
spark01 (spark master, shark, hadoop namenode): 15GB RAM, 4 vCPUs
spark02 (spark worker, hadoop datanode): 15GB RAM, 8 vCPUs
spark03 (spark worker): 15GB RAM, 8 vCPUs
spark04 (spark
from ..._incremental
Perhaps this helps understand our issue
On Thursday, November 20, 2014, Gordon Benjamin gordon.benjami...@gmail.com
wrote:
Hi,
We are seeing bad performance as we incrementally load data. Here is the
config
Spark standalone cluster
spark01 (spark master, shark
hey,
Can anyone tell me how to debug a SQL execution, ideally so that it shows
what the query is doing and how long each step takes?
Hi All,
I'm using Spark/Shark as the foundation for some reporting that I'm doing
and have a customers table with approximately 3 million rows that I've
cached in memory.
I've also created a partitioned table, which I've cached in memory on a
per-day basis
FROM
customers_cached
INSERT