I’m not sure what the best solution is, but I created a ticket here:

https://share.polymail.io/v1/z/b/NTkwYTM4NTgzMzIy/ptA3bo_BAIo9IWGz0OXooezKKqlB7FL6rPYuPfHCNnGvRz-yUxCoYMxiNmygRARAMgtzeZ4jz5UxoPQtQlYe-nLRtaBMkhFwn2t7rMLPwtJuDIDVDy0E_azvjPZDVrjRLGkL40kqM-qpxMg6BgBzUgcrawJMQ7dnfV93mVHjjMxqbM4r9K-k5eXP9dX4T5JgwSKXPpVopDZn19r-bP671LA_2MU4-_Vh
http://www.placeiq.com/

Paul Brenner

https://twitter.com/placeiq
https://www.facebook.com/PlaceIQ
https://www.linkedin.com/company/placeiq

DATA SCIENTIST

(217) 390-3033 

 


On Wed, May 03, 2017 at 4:01 AM Rick Moritz <rah...@gmail.com> wrote:

I think whether this is an issue depends a lot on how you use Zeppelin and 
what tools you need to integrate with. Sadly, Excel is still around as a 
data processing tool, and many people I introduce to Zeppelin are quite 
proficient with it, hence the desire to export to CSV in a trivial manner. 
Or the mere presence of the "download CSV" button leads them to expect it 
to work for reasonably sized data (i.e. up to around 10^6 rows).

I do prefer Ruslan's idea, but I think Zeppelin should include something 
similar out of the box. The key requirement should be that the data doesn't 
have to travel through the notebook interface, but rather is made available in 
a temporary folder and then served via a download link. The downside to this 
approach is that ideally you'd want this kind of operation to be interpreter 
agnostic. In that case every interpreter would need to offer an interface that 
allows collecting the data into a local-to-Zeppelin temporary folder.
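
A minimal sketch of the temporary-folder idea, in plain Python rather than anything Zeppelin actually provides. The function names, the `zeppelin-export-` prefix and the `/exports` base URL are all hypothetical; the point is only that rows are streamed to a file and the notebook UI receives a link instead of the data.

```python
import csv
import os
import tempfile
import uuid


def export_rows_to_temp_csv(rows, header, export_dir=None):
    """Stream rows to a CSV file in a temporary folder and return its path.

    `rows` can be any iterable (e.g. a generator), so the full result set
    never has to be held in memory or pushed through the notebook UI.
    """
    # Hypothetical location: a fresh temp directory unless one is supplied.
    export_dir = export_dir or tempfile.mkdtemp(prefix="zeppelin-export-")
    path = os.path.join(export_dir, f"{uuid.uuid4().hex}.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for row in rows:
            writer.writerow(row)
    return path


def download_link(path, base_url="/exports"):
    """Build the (hypothetical) URL under which a server would expose the file."""
    return f"{base_url}/{os.path.basename(path)}"


if __name__ == "__main__":
    rows = ((i, i * i) for i in range(1_000_000))
    path = export_rows_to_temp_csv(rows, ["n", "n_squared"])
    print(path, download_link(path))
```

Each interpreter would only need to supply an iterable of rows; serving the file and cleaning up old exports are left out of this sketch.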

Nonetheless, to turn Zeppelin into the serve-it-all solution that it could be, 
I do believe that "fixing" the CSV export is important. I'd definitely vote for 
a Jira ticket advancing this issue.

On Tue, May 2, 2017 at 9:33 PM, Kevin Niemann <kevin.niem...@gmail.com> wrote:

We came across this issue as well. Zeppelin's CSV export uses the data URI 
scheme, which base64-encodes all the rows into a single string. Chrome seems 
to crash with more than a few thousand rows, but Firefox has been able to handle 
over 100k for me. However, the Zeppelin notebook itself becomes slow at that 
point. I would also like better support for exporting a large set of rows; 
perhaps another tool is preferred?
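
To illustrate why this approach gets heavy, here is a small sketch that builds such a data URI in Python. It is not Zeppelin's code; the function name and the sample rows are made up. Base64 emits 4 characters for every 3 input bytes, so the single string the browser has to hold is roughly a third larger than the CSV itself.

```python
import base64
import csv
import io


def csv_data_uri(rows):
    """Encode rows as CSV and wrap them in a base64 data URI.

    Returns the raw CSV bytes and the resulting URI string.
    """
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    raw = buf.getvalue().encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return raw, "data:text/csv;base64," + encoded


if __name__ == "__main__":
    # Made-up sample data, only to compare sizes.
    rows = [(i, f"name-{i}", i * 1.5) for i in range(100_000)]
    raw, uri = csv_data_uri(rows)
    print(f"raw CSV bytes:   {len(raw):,}")
    print(f"data URI length: {len(uri):,}")
    print(f"overhead:        {len(uri) / len(raw):.2f}x")
```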

On Tue, May 2, 2017 at 10:00 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

Good idea to introduce in Zeppelin a way to download full datasets without 
actually visualizing them.

Not sure if this helps, but we taught our users to use

%sh hadoop fs -getmerge /hadoop/path/dir/ /some/nfs/mount/

for large files (they sometimes have to download datasets with millions of 
records).
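
For readers unfamiliar with getmerge: it concatenates the files under a source directory into one destination file. The sketch below is a local-filesystem analogue in Python, not the Hadoop implementation; the function name is hypothetical, and skipping names that start with `_` or `.` (such as `_SUCCESS` markers) is a choice made for this sketch.

```python
import shutil
from pathlib import Path


def merge_part_files(src_dir, dst_file):
    """Concatenate the files in src_dir (sorted by name) into dst_file.

    Returns the number of files merged.
    """
    parts = sorted(
        p for p in Path(src_dir).iterdir()
        # Sketch-only assumption: ignore marker and hidden files.
        if p.is_file() and not p.name.startswith(("_", "."))
    )
    with open(dst_file, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Copies in chunks, so large part files are never fully in memory.
                shutil.copyfileobj(src, out)
    return len(parts)
```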

They run Zeppelin on edge nodes that have NFS mounts to a drop zone.

P.S. Hue has a limit too, 100k rows by default:
https://github.com/cloudera/hue/blob/release-3.12.0/desktop/conf.dist/hue.ini#L905

Not sure how much it scales up.

--

Ruslan Dautkhanov

On Tue, May 2, 2017 at 10:41 AM, Paul Brenner <pbren...@placeiq.com> wrote:

There are limits to how much data the download-to-CSV button will download 
(1.5MB? 3500 rows?), which limits Zeppelin’s usefulness for our BI teams. This 
limit comes up long before we run into issues with showing too many rows of data 
in Zeppelin.

Unfortunately (fortunately?), Hue is the other tool the BI team has been using, 
and there they have no problem downloading much larger datasets to CSV. This is 
definitely not a requirement I’ve ever run into in the way I use Zeppelin, since 
I would just use Spark to write the data out. However, the BI team is not 
allowed to run Spark jobs (they use Hive via JDBC), so that download-to-CSV 
button is pretty important to them.

Would it be possible to significantly increase the limit? Even better, would it 
be possible to download more data than is shown? I assume this is the type of 
thing I would need to open a ticket for, but I wanted to ask here first.
