What would be the difference between calling cache.putAll(all rows) directly and first separating the rows by affinity key, then executing putAll inside a compute job for each group? If I'm not mistaken, a plain putAll should end up splitting those rows by affinity key on one of the servers, right? Is there a comparison of the two approaches?
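
To make the second option concrete, here is a minimal sketch of the grouping step only, in plain Java with no Ignite dependency. The class name AffinityBatching and the partition() helper are hypothetical; partition() is a simple hash-modulo stand-in, not Ignite's affinity function. In a real cluster I assume the partition would come from the cache's own affinity (for example ignite.affinity(cacheName).partition(key)), and each batch would then be passed to putAll inside a job sent to the node that owns it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class AffinityBatching {
    // Hypothetical stand-in for the cache's key-to-partition mapping.
    // Assumption: a real loader would ask Ignite's affinity API instead.
    static int partition(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Groups rows into one batch per partition, so that every batch
    // can be written with a single putAll on the owning node.
    static <K, V> Map<Integer, Map<K, V>> groupByPartition(Map<K, V> rows, int partitions) {
        Map<Integer, Map<K, V>> batches = new TreeMap<>();
        for (Map.Entry<K, V> row : rows.entrySet()) {
            int part = partition(row.getKey(), partitions);
            batches.computeIfAbsent(part, p -> new HashMap<>())
                   .put(row.getKey(), row.getValue());
        }
        return batches;
    }
}
```

The sketch says nothing about which option is faster; it only shows what "separating by affinity key" would mean before the compute job runs.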

On Fri, Feb 19, 2021 at 9:51 AM Taras Ledkov <[email protected]> wrote:
> Hi Vladimir,
> Did you try to use the SQL command 'COPY FROM <csv_file>' via thin JDBC?
> This command uses 'IgniteDataStreamer' to write data into the cluster and
> parses the CSV on the server node.
>
> PS. AFAIK IgniteDataStreamer is one of the fastest ways to load data.
>
> Hi Denis,
>
> Data space is 3.7 GB according to the MSSQL table properties.
>
> Vladimir
>
> At 9:47 on February 19, 2021, Denis Magda <[email protected]> wrote:
>
> Hello Vladimir,
>
> Good to hear from you! How much is that in gigabytes?
>
> -
> Denis
>
>
> On Thu, Feb 18, 2021 at 10:06 PM <[email protected]> wrote:
>
> In September 2020 I published a paper about Loading Large Datasets into
> Apache Ignite by Using a Key-Value API (English [1] and Russian [2]
> versions). The approach described works in production, but shows
> unacceptable performance for very large tables.
>
> The story continues, and yesterday I finished the proof of concept for
> very fast loading of a very big table. A partitioned MSSQL table of about
> 295 million rows was loaded by a 4-node Ignite cluster in 3 min 35 sec.
> Each node executed its own SQL queries in parallel and then distributed
> the loaded values across the other cluster nodes.
>
> That result will probably be of interest to the community.
>
> Regards,
> Vladimir Chernyi
>
> [1] https://www.gridgain.com/resources/blog/how-fast-load-large-datasets-apache-ignite-using-key-value-api
> [2] https://m.habr.com/ru/post/526708/
>
> --
> Sent from the Yandex.Mail mobile app
>
> --
> Taras Ledkov
> Mail-To: [email protected]
