As a follow up...
BBEdit's Pattern Playground is a great help in constructing tedious grep
patterns like you'll need for your filtering and sorting needs. The really
tedious part is getting the field position(s) you want to filter or sort on
so you can modify that field's match pattern to conform to the desired
filter or sorting criteria.
For example... For your " Filter all lines that have ADR_CHK_KZ = 1" using
Text -> Process Lines Containing ... with the grep pattern:
\d{3};\w{3};[^;]*;[^;]*;\d{10};\w{2};\d{2};\d{5};[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;(1);[^;]*;[^\n]*
will do the trick. For filtering you don't need the group capturing on the
1 but it is useful with Pattern Playground to verify you're getting the
right field position and field contents matched.
For your "Sort the file by MSGNO, ADRC_COUNTRY, ADRC_REGION,
ADRC_POST_CODE1, ADRC_CITY1, ADRC_CITY2, ADRC_STREET and ADRC_HOUSE_NUM1"
using Text -> Sort Lines ... with a grep pattern of:
\d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*
with "Specific sub-patterns" selected with \8\1\2\3\4\5\6\7 in the fill in
field will sort your example text using your desired field ordering.
On Tuesday, March 25, 2025 at 12:53:47 PM UTC-7 GP wrote:
> For filtering, look at Text -> Process Lines Containing ... and for
> sorting Text -> Sort Lines ... using grep patterns to identify what you
> want to match for filtering and what subpattern field or fields you want to
> sort ordered on.
>
> If the number of fields in your sample is representative of the real CSV
> files you're working with, it is going to be something of a pain in the
> rear coming up with the grep patterns needed to accomplish the desired
> filtering and sorting.
>
> On Tuesday, March 25, 2025 at 11:03:35 AM UTC-7 Vlad Ghitulescu wrote:
>
>> Hey,
>>
>>
>> I use BBEdit very often while working with big CSV-files (300 - 500 MB,
>> up to 4 million rows) looking like this:
>>
>> MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;MSGNO;MESSAGE
>>
>>
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723592;DE;09;86415;Mering;;Sankt
>>
>> Afra;Egerländer Straße;;;DE;09;86415;Mering;500000002795;, Schwab;Sankt
>> Afra;00000006;Egerländerstraße;910011919800;;;0;0;0;0;1;0;1;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;
>>
>>
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;Riedering;500000002552;b
>>
>> Rosenheim, Oberbay;Patting;00000037;Pattinger
>> Straße;910003809300;;;0;0;0;0;1;0;1;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;
>>
>>
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723878;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
>>
>> Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
>>
>> Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723908;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
>>
>> Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
>>
>> Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723918;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
>>
>> Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
>>
>> Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723956;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner
>>
>> Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner
>>
>> Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724554;DE;09;95131;Schwarzenbach
>>
>> a.Wald;;Schwarzenbach a
>> Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
>> a.Wald;500000011836;;Schwarzenbach
>> a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724593;DE;09;95131;Schwarzenbach
>>
>> a.Wald;;Schwarzenbach a
>> Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach
>> a.Wald;500000011836;;Schwarzenbach
>> a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
>>
>> Once in a while I’d like to filter or sort such huge files by one or more
>> columns, like:
>>
>> 1. Filter all lines that have ADR_CHK_KZ = 1 or
>> 2. Sort the file by MSGNO, ADRC_COUNTRY, ADRC_REGION, ADRC_POST_CODE1,
>> ADRC_CITY1, ADRC_CITY2, ADRC_STREET and ADRC_HOUSE_NUM1.
>>
>> Is there a way to do this sort of tasks with BBEdit?
>>
>> Thanks!
>>
>>
>> Regards,
>> Vlad
>>
>>
>>
>>
--
This is the BBEdit Talk public discussion group. If you have a feature request
or believe that the application isn't working correctly, please email
"[email protected]" rather than posting here. Follow @bbedit on Mastodon:
<https://mastodon.social/@bbedit>
---
You received this message because you are subscribed to the Google Groups
"BBEdit Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion visit
https://groups.google.com/d/msgid/bbedit/50130484-14eb-4298-b762-800f88b2c66en%40googlegroups.com.