Hi,
I ran with fewer columns(6). No much deviation between LINQ and CompiledLINQ
BenchmarkDotNet v0.14.0, Windows 10
(10.0.19044.2130/21H2/November2021Update)
Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and 6
physical cores
.NET SDK 8.0.101
[Host] : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
| Method | Mean | Error | StdDev | Gen0 | Gen1 |
Gen2 | Allocated |
|-------------
|---------:|---------:|---------:|----------:|---------:|---------:|----------:|
| Linq | 188.6 ms | 15.08 ms | 43.26 ms | 1000.0000 | - |
- | 126.01 MB |
| LinqCompiled | 166.8 ms | 3.24 ms | 5.23 ms | 666.6667 | 666.6667 |
333.3333 | 121.18 MB |
Thanks & Regards,
Charlin
On Mon, 25 Nov 2024 at 18:40, Pavel Tupitsyn <[email protected]> wrote:
> Your query selects the entire row, which is quite big, this takes extra
> time.
> If possible, try selecting a subset of columns.
>
> On Mon, Nov 25, 2024 at 2:40 PM Charlin S <[email protected]> wrote:
>
>> Hi,
>> Yes, It's approximately 300MB.
>>
>> Thanks
>> Charlin
>>
>>
>> On Mon, 25 Nov 2024 at 12:52, Pavel Tupitsyn <[email protected]>
>> wrote:
>>
>>> > How big is one row on average, in bytes?
>>> > Approximately 300MB size
>>>
>>> 300 Megabytes, is that correct? This explains the measurements, I think.
>>>
>>> On Fri, Nov 22, 2024 at 4:27 PM Charlin S <[email protected]>
>>> wrote:
>>>
>>>> Hi
>>>> records count => more than 160,0000
>>>> No of columns => 87 Columns (80 columns data null only)
>>>> How big is one row on average, in bytes? Approximately 300MB size
>>>>
>>>> New stats after removing 80 columns with same no of records
>>>> BenchmarkDotNet v0.14.0, Windows 10
>>>> (10.0.19044.2130/21H2/November2021Update)
>>>> Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and 6
>>>> physical cores
>>>> .NET SDK 8.0.101
>>>> [Host] : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>> DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>
>>>>
>>>> | Method | Mean | Error | StdDev | Gen0 | Gen1
>>>> | Gen2 | Allocated |
>>>> |-------------
>>>> |---------:|---------:|---------:|----------:|---------:|---------:|----------:|
>>>> | Linq | 188.6 ms | 15.08 ms | 43.26 ms | 1000.0000 | -
>>>> | - | 126.01 MB |
>>>> | LinqCompiled | 166.8 ms | 3.24 ms | 5.23 ms | 666.6667 | 666.6667
>>>> | 333.3333 | 121.18 MB |
>>>>
>>>> Thanks
>>>> Charlin
>>>>
>>>> On Fri, 22 Nov 2024 at 00:07, Pavel Tupitsyn <[email protected]>
>>>> wrote:
>>>>
>>>>> So how is the real data different from the test data?
>>>>> - How many rows are there?
>>>>> - How many columns?
>>>>> - How big is one row on average, in bytes?
>>>>>
>>>>> On Thu, Nov 21, 2024 at 7:08 PM Charlin S <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> The summary with real data :-
>>>>>> BenchmarkDotNet v0.14.0, Windows 10
>>>>>> (10.0.19044.2130/21H2/November2021Update)
>>>>>> Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and 6
>>>>>> physical cores
>>>>>> .NET SDK 8.0.101
>>>>>> [Host] : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>>> DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>>>
>>>>>>
>>>>>> | Method | Mean | Error | StdDev | Median | Gen0
>>>>>> | Gen1 | Gen2 | Allocated |
>>>>>> |-------------
>>>>>> |---------:|---------:|---------:|---------:|----------:|----------:|----------:|----------:|
>>>>>> | Linq | 530.0 ms | 17.28 ms | 48.46 ms | 512.4 ms |
>>>>>> 7000.0000 | 3000.0000 | 1000.0000 | 999.57 MB |
>>>>>> | LinqCompiled | 543.3 ms | 21.48 ms | 59.53 ms | 525.3 ms |
>>>>>> 6000.0000 | 2000.0000 | 1000.0000 | 994.65 MB |
>>>>>>
>>>>>> // * Hints *
>>>>>> Outliers
>>>>>> Benchmarks.Linq: Default -> 9 outliers were removed (722.82
>>>>>> ms..1.42 s)
>>>>>> Benchmarks.LinqCompiled: Default -> 11 outliers were removed
>>>>>> (767.85 ms..2.67 s)
>>>>>>
>>>>>> Thanks
>>>>>> Charlin
>>>>>>
>>>>>> On Thu, 21 Nov 2024 at 16:01, Pavel Tupitsyn <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Can you adjust this benchmark project to have the data and query
>>>>>>> more like your real project?
>>>>>>> Currently we have a similar number of rows and the query performs
>>>>>>> very well, let's understand what is different.
>>>>>>>
>>>>>>> On Thu, Nov 21, 2024 at 11:43 AM Charlin S <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Thank you for your email. I apologize for the late reply.
>>>>>>>> I ran the sample and summaries are:-
>>>>>>>>
>>>>>>>> BenchmarkDotNet v0.14.0, Windows 10
>>>>>>>> (10.0.19044.2130/21H2/November2021Update)
>>>>>>>> Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and
>>>>>>>> 6 physical cores
>>>>>>>> .NET SDK 8.0.101
>>>>>>>> [Host] : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>>>>> DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>>>>>
>>>>>>>>
>>>>>>>> | Method | Mean | Error | StdDev |
>>>>>>>> |--------------------|--------------:|----------:|---------------:|
>>>>>>>> | Linq | 94.60 us | 1.515 us | 1.417 us |
>>>>>>>> | LinqCompiled | 16.50 us | 0.328 us | 0.449 us |
>>>>>>>>
>>>>>>>> LINQCompiled much faster than LINQ in sample code but with real
>>>>>>>> data it's not much different.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Charlin
>>>>>>>>
>>>>>>>> On Thu, 14 Nov 2024 at 20:30, Pavel Tupitsyn <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> 1. Run the code in Release configuration
>>>>>>>>> 2. Try Compiled Query
>>>>>>>>> https://ignite.apache.org/docs/latest/net-specific/net-linq#compiled-queries
>>>>>>>>>
>>>>>>>>> My benchmark on 550_000 rows with 1 matching row shows 28
>>>>>>>>> microseconds for regular query and 6 microseconds for compiled query:
>>>>>>>>> https://gist.github.com/ptupitsyn/189c8164061bace8d975b2ec39045ca4
>>>>>>>>>
>>>>>>>>> 200 ms is a lot. Can you run my code and share the results?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Nov 14, 2024 at 1:46 PM Charlin S <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> I was measuring performance after 5 times. Now I have run more
>>>>>>>>>> than 100 times and var queryRes = query.ToList(); taken 202
>>>>>>>>>> milliseconds
>>>>>>>>>> as minimum.
>>>>>>>>>> Will be appreciated if I get a solution for getting the result in
>>>>>>>>>> 10 - 50 milliseconds.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Charlin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, 14 Nov 2024 at 15:35, Pavel Tupitsyn <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> How do you measure? Do you perform a warm up before measurement?
>>>>>>>>>>> Try running the query 100 times before the measurement. Or,
>>>>>>>>>>> ideally, use BenchmarkDotNet to get accurate results.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Nov 14, 2024 at 11:51 AM Charlin S <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Thank you for your response.and sample.
>>>>>>>>>>>> My cache instance was created as TestIcache=_ignite.
>>>>>>>>>>>> GetOrCreateCache<string, TestModel>("TestModel"); so index was
>>>>>>>>>>>> not applied.
>>>>>>>>>>>> now I am creating cache instance like below and index applied
>>>>>>>>>>>> var cacheCfgTestModel = new CacheConfiguration("TestModel",
>>>>>>>>>>>> new QueryEntity(typeof(TestModel)));
>>>>>>>>>>>> TestIcache=_ignite.GetOrCreateCache<string, TestModel
>>>>>>>>>>>> >(cacheCfgTestModel);
>>>>>>>>>>>>
>>>>>>>>>>>> SQL:
>>>>>>>>>>>> select _T0._KEY, _T0._VAL from "TestModel".TESTMODEL as _T0
>>>>>>>>>>>> where (_T0.COUNTRYCODE IS NOT DISTINCT FROM ?)
>>>>>>>>>>>>
>>>>>>>>>>>> EXPLANATION:
>>>>>>>>>>>> SELECT
>>>>>>>>>>>> _T0__Z0._KEY AS __C0_0,
>>>>>>>>>>>> _T0__Z0._VAL AS __C0_1
>>>>>>>>>>>> FROM "TestModel".TESTMODEL _T0__Z0
>>>>>>>>>>>> /* "TestModel".TESTMODEL_COUNTRYCODE_ASC_IDX: COUNTRYCODE
>>>>>>>>>>>> IS ?1 */
>>>>>>>>>>>> WHERE _T0__Z0.COUNTRYCODE IS ?1
>>>>>>>>>>>> SELECT
>>>>>>>>>>>> __C0_0 AS _KEY,
>>>>>>>>>>>> __C0_1 AS _VAL
>>>>>>>>>>>> FROM PUBLIC.__T0
>>>>>>>>>>>> /* "TestModel"."merge_scan" */
>>>>>>>>>>>>
>>>>>>>>>>>> Total records:27713 = > 27713
>>>>>>>>>>>> Time taken for Query:2 milliseconds -> var query = TestIcache.
>>>>>>>>>>>> AsCacheQueryable().Where(x => x.Value.CountryCode ==
>>>>>>>>>>>> CountryCode); Time taken for QueryResult:313 milliseconds -> v
>>>>>>>>>>>> ar queryRes = query.ToList();
>>>>>>>>>>>> Total Time taken:316 milliseconds
>>>>>>>>>>>> Why var queryRes = query.ToList(); taking 313 milliseconds
>>>>>>>>>>>> even if the result having only one record is taking nearly 300
>>>>>>>>>>>> milliseconds.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Charlin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, 13 Nov 2024 at 22:36, Pavel Tupitsyn <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> - This shows a full scan, the index is not being used.
>>>>>>>>>>>>> - There is no parametrization, I'm not sure why. Which Ignite
>>>>>>>>>>>>> version do you use?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've prepared a simple proof of concept [1] which shows that
>>>>>>>>>>>>> an index is used in this situation. Generated SQL is different:
>>>>>>>>>>>>>
>>>>>>>>>>>>> >> SQL:
>>>>>>>>>>>>> select _T0._KEY, _T0._VAL from "c".TESTMODEL as _T0 where
>>>>>>>>>>>>> (_T0.COUNTRYCODE IS NOT DISTINCT FROM ?)
>>>>>>>>>>>>>
>>>>>>>>>>>>> >> EXPLAIN:
>>>>>>>>>>>>> SELECT
>>>>>>>>>>>>> _T0__Z0._KEY AS __C0_0,
>>>>>>>>>>>>> _T0__Z0._VAL AS __C0_1
>>>>>>>>>>>>> FROM "c".TESTMODEL _T0__Z0
>>>>>>>>>>>>> /* "c".TESTMODEL_COUNTRYCODE_ASC_IDX: COUNTRYCODE IS ?1 */
>>>>>>>>>>>>> WHERE _T0__Z0.COUNTRYCODE IS ?1
>>>>>>>>>>>>> SELECT
>>>>>>>>>>>>> __C0_0 AS _KEY,
>>>>>>>>>>>>> __C0_1 AS _VAL
>>>>>>>>>>>>> FROM PUBLIC.__T0
>>>>>>>>>>>>> /* "c"."merge_scan" */
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://gist.github.com/ptupitsyn/882b9b9e5e13c82fcf96f83fd53b2777
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Nov 13, 2024 at 6:41 PM Charlin S <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> 1. Use "==" instead of "string.Equals", looks like the
>>>>>>>>>>>>>> provider does not like string.Equals.
>>>>>>>>>>>>>> changed to ==
>>>>>>>>>>>>>> 2. Share the generated SQL
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> SELECT Field1,Field2 FROM TestModel WHERE COUNTRYCODE='AU'
>>>>>>>>>>>>>> 3. Share the result of EXPLAIN for that SQL
>>>>>>>>>>>>>> plan=SELECT
>>>>>>>>>>>>>> __Z0.FIELD1 AS __C0_0,
>>>>>>>>>>>>>> __Z0.FIELD2 AS __C0_1
>>>>>>>>>>>>>> FROM "TestModel".TESTMODEL __Z0
>>>>>>>>>>>>>> /* "TestModel".TESTMODEL.__SCAN_ */
>>>>>>>>>>>>>> /* scanCount: 1681786 */
>>>>>>>>>>>>>> WHERE (__Z0.COUNTRYCODE = 'AU')
>>>>>>>>>>>>>> 4. Is there an index on CountryCode?
>>>>>>>>>>>>>> Yes, [QuerySqlField(IsIndexed = true)]
>>>>>>>>>>>>>> public string CountryCode { get; set; }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Charlin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, 13 Nov 2024 at 15:06, Pavel Tupitsyn <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Use "==" instead of "string.Equals", looks like the
>>>>>>>>>>>>>>> provider does not like string.Equals
>>>>>>>>>>>>>>> 2. Share the generated SQL
>>>>>>>>>>>>>>> 3. Share the result of EXPLAIN for that SQL
>>>>>>>>>>>>>>> 4. Is there an index on CountryCode?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Nov 13, 2024 at 9:16 AM Charlin S <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> Thanks for your response.
>>>>>>>>>>>>>>>> I have tried different ways but the result is the same.
>>>>>>>>>>>>>>>> my cache records count is above 160,0000
>>>>>>>>>>>>>>>> var watchLINQQuery = System.Diagnostics.Stopwatch.StartNew();
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> var tmp= TestIcache.AsEnumerable().Where(tc=> string
>>>>>>>>>>>>>>>> .Equals(tc.Value.CountryCode, CountryCode)).Select(tc => tc
>>>>>>>>>>>>>>>> .Value);
>>>>>>>>>>>>>>>> watchLINQQuery.Stop(); //0 or 1 Milliseconds
>>>>>>>>>>>>>>>> var watchIQueryableToArray = System.Diagnostics.
>>>>>>>>>>>>>>>> Stopwatch.StartNew();
>>>>>>>>>>>>>>>> var result = tmp.ToArray(); // 12354 milliseconds
>>>>>>>>>>>>>>>> taken
>>>>>>>>>>>>>>>> watchIQueryableToArray.Stop();
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> var result = tmp.ToArray(); taking similar time even if my
>>>>>>>>>>>>>>>> query result returns 1 or 2 records. Please suggest to me how
>>>>>>>>>>>>>>>> to improve
>>>>>>>>>>>>>>>> this query performance.
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Charlin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, 5 Nov 2024 at 19:01, Pavel Tupitsyn <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. Check the generated SQL
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Cast to ICacheQueryable
>>>>>>>>>>>>>>>>> var cacheQueryable = (ICacheQueryable) query;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Get resulting fields query
>>>>>>>>>>>>>>>>> SqlFieldsQuery fieldsQuery =
>>>>>>>>>>>>>>>>> cacheQueryable.GetFieldsQuery();
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> // Examine generated SQL
>>>>>>>>>>>>>>>>> Console.WriteLine(fieldsQuery.Sql);
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. Try EXPLAIN and other suggestions from
>>>>>>>>>>>>>>>>> https://ignite.apache.org/docs/latest/SQL/sql-tuning
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3. Is there an index on CountryCode?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 4. Case-insensitive comparison might be inefficient. A
>>>>>>>>>>>>>>>>> better approach is to store lower/uppercase value in cache,
>>>>>>>>>>>>>>>>> and then using
>>>>>>>>>>>>>>>>> lower/upper criteria for search
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Nov 5, 2024 at 1:08 PM Charlin S <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Pavel,
>>>>>>>>>>>>>>>>>> Thanks for your email. it reduces to 9 seconds after
>>>>>>>>>>>>>>>>>> removing AsParallel. Please let me know if there are any
>>>>>>>>>>>>>>>>>> more options to
>>>>>>>>>>>>>>>>>> get good performance.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>> Charlin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, 5 Nov 2024 at 13:31, Pavel Tupitsyn <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi, "AsParallel" is the problem, it causes the entire
>>>>>>>>>>>>>>>>>>> data set to be loaded locally before filtering.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Remove it so that the LINQ expression can be translated
>>>>>>>>>>>>>>>>>>> into Ignite SQL and executed more efficiently.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://ignite.apache.org/docs/latest/net-specific/net-linq
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Nov 5, 2024 at 8:58 AM Charlin S <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am trying Ignit.Net LINQ for the first time and
>>>>>>>>>>>>>>>>>>>> seeing very slowness with my linq query taking 13-15
>>>>>>>>>>>>>>>>>>>> seconds. Test model
>>>>>>>>>>>>>>>>>>>> having 550,000 records
>>>>>>>>>>>>>>>>>>>> my query as below
>>>>>>>>>>>>>>>>>>>> TestModel having index for CountryCode field.
>>>>>>>>>>>>>>>>>>>> ICache<string, TestModel> cache =
>>>>>>>>>>>>>>>>>>>> ignite.GetCache<string, TestModel>(CacheName);
>>>>>>>>>>>>>>>>>>>> IQueryable<ICacheEntry<string, TestModel>> i
>>>>>>>>>>>>>>>>>>>> gniteQuerable = cache.AsCacheQueryable();
>>>>>>>>>>>>>>>>>>>> igniteQuerable.AsParallel()
>>>>>>>>>>>>>>>>>>>> .Where(x=>string.Equals(x.Value.CountryCode,
>>>>>>>>>>>>>>>>>>>> criteria.CountryCode, StringComparison.
>>>>>>>>>>>>>>>>>>>> CurrentCultureIgnoreCase))
>>>>>>>>>>>>>>>>>>>> .Select(x => x.Key).ToList();
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Charlin
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>