Re: Bug#881692: command-not-found: I re-wrote command-not-found

2017-11-26 Thread Shawn Landden
On Mon, Nov 13, 2017 at 11:50 PM, Julian Andres Klode <j...@debian.org> wrote:
> (forwarding this to ubuntu-devel-discuss and Zygmunt)
>
> On Mon, Nov 13, 2017 at 10:33:39PM -0800, Shawn Landden wrote:
>> Package: command-not-found
>> Severity: wishlist
>>
>> I re-wrote command-not-found to get rid of the Python dependency, and
>> to reduce the database size, so as to reduce memory usage.
>>
>> https://github.com/shawnl/command-not-found
>>
>> I was preparing to upload it to mentors as command-not-found-ng
>
> I also rewrote it years ago, but using the same database format,
> just in C. It was a lot faster. I don't understand the memory usage
> bit - it should not matter how large the database is, it's memory
> mapped, and not read into memory, as such memory usage should be
> roughly constant.
>
> Questions/Comments for your approach:
>
> * Did you test your format on a slow HDD with caches dropped? It
>   must not be slower than the Python one (that one is way too slow
>   already) - I did, it seems to be faster (0.4 vs 0.68 seconds)
>   - I believe the database-based C rewrite was even much faster,
>   though.
I switched it to mmap() and am now getting 0.27-0.45 seconds with caches
dropped, even after adding translations. It is 100% C and sh (same
postinst and postrm).
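
For readers following the mmap() point, here is a minimal sketch of a
memory-mapped lookup. It assumes a hypothetical database of
"command<TAB>package" lines sorted bytewise by command. The names cnf_map
and cnf_find and the file layout are illustrative assumptions, not the
format used in the repository above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical layout: one "command<TAB>package\n" record per line,
 * sorted bytewise by command. Not the real database format. */

/* Map a whole file read-only. Returns NULL on error or for an empty file. */
const char *cnf_map(const char *path, size_t *size)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;
    *size = (size_t)st.st_size;
    return p;
}

/* Binary search over the mapped bytes. On a hit, returns a pointer to the
 * package name (not NUL-terminated) and stores its length in *len.
 * Returns NULL on a miss. lo and hi always sit on line boundaries. */
const char *cnf_find(const char *db, size_t size, const char *cmd, size_t *len)
{
    assert(db && cmd && len);
    size_t lo = 0, hi = size, n = strlen(cmd);

    while (lo < hi) {
        size_t start = lo + (hi - lo) / 2;
        /* Back up to the start of the line containing the midpoint. */
        while (start > lo && db[start - 1] != '\n')
            start--;

        const char *line = db + start;
        const char *eol = memchr(line, '\n', hi - start);
        size_t line_len = eol ? (size_t)(eol - line) : hi - start;
        const char *tab = memchr(line, '\t', line_len);
        size_t key_len = tab ? (size_t)(tab - line) : line_len;

        /* Compare like strcmp, but on length-delimited keys. */
        int c = memcmp(cmd, line, key_len < n ? key_len : n);
        if (c == 0)
            c = (n > key_len) - (n < key_len);

        if (c == 0) {
            if (!tab)
                return NULL; /* malformed record without a value */
            *len = line_len - key_len - 1;
            return tab + 1;
        }
        if (c < 0)
            hi = start;
        else
            lo = start + line_len + (eol ? 1 : 0);
    }
    return NULL;
}
```

With a mapping like this, only the pages the binary search touches are
faulted in, so a cold-cache lookup reads a few pages instead of the whole
file.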

Ping.

-- 
Ubuntu-devel-discuss mailing list
Ubuntu-devel-discuss@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss


Re: Bug#881692: command-not-found: I re-wrote command-not-found

2017-11-17 Thread Shawn Landden
>
>
> Ruby is just a major no go.

Re-written in C.

And in the future, what about Lua? It is only 300KB.

> At that system level, the best choices
> are Perl, Shell, and C++. Maybe Python (on Ubuntu it's in ubuntu-minimal,
> but in Debian it's only used by standard priority and less, perl on the
> other hand is required and essential). Ruby has the lowest priority
> - optional.
>
> --
> Debian Developer - deb.li/jak | jak-linux.org - free software dev
> Ubuntu Core Developer  de, en speaker
>


Re: Bug#881692: command-not-found: I re-wrote command-not-found

2017-11-16 Thread Shawn Landden
On Thu, Nov 16, 2017 at 11:39 PM, Xen wrote:

> Julian Andres Klode schreef op 14-11-2017 8:50:
>
> * You should not depend on grep, sed, coreutils, they are Essential.
>>
>
> Can I ask what this means?
>
> I actually assume that these dependencies are not *required*, not that you
> can't use the tools.

Required: yes. That is the highest priority. sysvinit was Required: yes until
systemd came along. See https://www.debian.org/doc/debian-policy/#priorities

Speaking of which, I can't use 'apt-get indextargets' from shell and had to
rewrite in Ruby, because sed does not support lazy matching, and I don't
know how else to match NOT \n\n. (It also doesn't seem to support multiple
submatches.) Old regular expression implementations are showing their
age (not to mention Perl's non-regular features).
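
For what it's worth, the "everything up to the next blank line" match needs
no regex at all in C. A minimal sketch, assuming the input is stanza-style
text with blank lines between records; next_stanza is a hypothetical helper
name, not something from the repository:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Given the start of a stanza in a NUL-terminated buffer, store the
 * stanza's length in *len and return the start of the following stanza,
 * or NULL if this was the last one. Stanzas are separated by "\n\n". */
const char *next_stanza(const char *p, size_t *len)
{
    assert(p && len);
    const char *sep = strstr(p, "\n\n");
    if (!sep) {
        *len = strlen(p); /* last stanza runs to the end of the buffer */
        return NULL;
    }
    *len = (size_t)(sep - p);
    sep += 2;
    while (*sep == '\n') /* tolerate extra blank lines */
        sep++;
    return *sep ? sep : NULL;
}
```

A caller would loop until next_stanza returns NULL, handling p[0..len) on
each pass.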


Re: Bug#881692: command-not-found: I re-wrote command-not-found

2017-11-16 Thread Shawn Landden
On Thu, Nov 16, 2017 at 6:44 PM, Colin Watson <cjwat...@ubuntu.com> wrote:

> On Thu, Nov 16, 2017 at 05:10:19PM -0800, Shawn Landden wrote:
> > On Mon, Nov 13, 2017 at 11:50 PM, Julian Andres Klode <j...@debian.org>
> > wrote:
> > > * It needs to be translated - also very important.
> >
> > I made a pot file and used translations from the python version, but I
> > can't get my app to look for translations (as examined through strace). I
> > read the gettext manual and do not know what I am doing wrong.
>
> Looking at
> https://github.com/shawnl/command-not-found/blob/master/command-not-found.c,
> your problem appears to be that you aren't calling setlocale().  You
> should normally call this before calling bindtextdomain() and
> textdomain():
>
>   setlocale(LC_ALL, "");
>
> (The gettext manual does cover this, but possibly you were looking at
> some different bit of it.)
>
Managed to re-use all the translations from launchpad of the existing
command-not-found.
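
For the archive, a minimal sketch of the call order Colin describes. The
text domain name and locale directory below are assumptions for
illustration, as is the helper name cnf_init_i18n; the real values depend on
how the package installs its .mo files.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

#define _(s) gettext(s)

/* Assumed values, not taken from the actual package. */
#define CNF_DOMAIN    "command-not-found"
#define CNF_LOCALEDIR "/usr/share/locale"

/* Set up message translation. Returns 0 on success, -1 on failure. */
int cnf_init_i18n(void)
{
    /* Pick up LANG/LC_* from the environment first. Without this call the
     * program stays in the "C" locale and gettext() never opens a catalog.
     * A NULL return just means we keep the "C" locale. */
    setlocale(LC_ALL, "");

    if (bindtextdomain(CNF_DOMAIN, CNF_LOCALEDIR) == NULL)
        return -1;
    if (textdomain(CNF_DOMAIN) == NULL)
        return -1;
    return 0;
}
```

After cnf_init_i18n() succeeds, user-visible strings go through _(), for
example printf(_("%s: command not found\n"), name). Untranslated strings
fall back to the original msgid.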

>
> --
> Colin Watson   [cjwat...@ubuntu.com]
>


Re: Bug#881692: command-not-found: I re-wrote command-not-found

2017-11-16 Thread Shawn Landden
> * Did you test your format on a slow HDD with caches dropped? It
>   must not be slower than the Python one (that one is way too slow
>   already) - I did, it seems to be faster (0.4 vs 0.68 seconds)
>   - I believe the database-based C rewrite was even much faster,
>   though
>
I tested with the kyotocabinet backend and it was slower with dropped caches
on a hard drive (1 second), which is the slow case I am most concerned
with. Small size makes a difference. The code is at
https://github.com/shawnl/command-not-found/tree/kyotocabinet


Re: Bug#881692: command-not-found: I re-wrote command-not-found

2017-11-14 Thread Shawn Landden
On Nov 14, 2017 8:15 AM, "Julian Andres Klode" wrote:

On Tue, Nov 14, 2017 at 03:35:02PM +, John Lenton wrote:
> On 14 November 2017 at 12:34, Julian Andres Klode wrote:
> > On Tue, Nov 14, 2017 at 01:00:54PM +0100, Zygmunt Krynicki wrote:
> >> I would love if we have a compact representation of mapping from name
> >> to list of bits of information where each bit can be a small structure
> >> with some data. Apart from components for ubuntu archive it could be
> >> used to store facts about snap packages, flatpaks, etc. I would try to
> >> avoid a simplistic command -> package mapping as that will force us to
> >> encode things into strings in an ad-hoc way.
> >
> > That makes sense to me. But then we're back on a db, I guess. I sort
> > of like this minimal approach.
>
> I was thinking in the other direction, was going to see how it behaved
> with sqlite as the store. Would that be objectionable?

Using a relational database for a simple key -> structure mapping seems
overkill and a mismatch for the problem, and the SQL does not make it
more readable.

I'd play with lmdb and kyotocabinet, these are two high-performance
key-value file databases and then encode a structure as mentioned
before.

I had some kyotocabinet code (I maintain that package, which btw is in
mentors), but this way is at least half the size. (Kyotocabinet is 1 MB and
it almost doubles the size of the db, even using the lower-overhead B-tree
back end. These entries are just very small.)

For the text file approach, we can even go human-readable, like git:

git just encodes a number as a fixed-length decimal number; we can do
the same, and then just encode (length, key), (length, data) pairs after
each other (or as mentioned, just use the "index" as the field id, and
store field ids in the program). This uses a bit more space, but encodes
everything in a format you could read with a text editor, and should
not be terribly less efficient.

The thing is: This needs to be as efficient as possible: it should
be below 100ms (or better 50ms), regardless of whether caches are dropped
or not.

                      Python code | Shawn's code

SSD, cache                50ms    |      5ms
SSD, cache dropped       256ms    |     15ms
HDD, cache                50ms    |      5ms
HDD, cache dropped       530ms    |     15ms

I guess Shawn's code could even be improved in performance by
avoiding the subprocess execution, avoiding various ld cache
lookups and library loads.

I am going to have to bring it in process to add the spell check code.


That said, space requirements might matter too.
--
Debian Developer - deb.li/jak | jak-linux.org - free software dev
Ubuntu Core Developer  de, en speaker