Programing Challenge: Constructing a Tree Given Its Edges.
Show you are the boss.

http://xahlee.info/perl-python/python_construct_tree_from_edge.html

Here's the plain text.

── ── ── ── ──

Problem: given a list of edges of a tree, each of the form [child, parent], construct the tree.

Here's a sample input: [[0, 2], [3, 0], [1, 4], [2, 4]].

Here's a sample print of a tree data structure:

4
 1
 2
  0
   3

This means the root node's name is 4. It has 2 branches, named 1 and 2. The branch named 2 has 1 child, named 0, and that node has one child, named 3.

The nodes' names are given as integers, but you can assume them to be strings.

You can choose your own data structure for the output. For example, a nested list with the 1st element being the node name, or a nested hash of hash, where the key is the node name and the value is its children.

Here's a sample data structure in python, using hash of hash:

{ 4: { 1: {}, 2: { 0: { 3: {} } } } }

Other data structures are accepted.

Code it using your favorite language. I'll be able to look at and verify in Mathematica, Python, Ruby, Perl, PHP, JavaScript, Emacs Lisp, Java. But other langs, such as Clojure and other lisps, OCaml, Haskell, erlang, Fortress, APL, HOL, Coq, Lua, Factor, Rust, Julia … are very welcome. 〔☛ Proliferation of Computing Languages〕

You should be able to do this within 8 hours from scratch. Took me 5 hours.

Python solution will be posted in a week, on 2014-01-14 (or sooner if many show solutions).

Post your solution in a comment (or link to github or pastebin).

-- https://mail.python.org/mailman/listinfo/python-list
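For readers who want a starting point, here is one possible sketch in Python 3 of the hash-of-hash approach described in the post. It is not the author's posted solution; the function names build_tree and format_tree are made up for illustration, and it assumes the edge list really describes a single tree (exactly one node without a parent).

```python
def build_tree(edges):
    """Build a nested dict {root: {child: {...}}} from [child, parent] pairs."""
    children = {}       # node name -> dict holding that node's children
    has_parent = set()  # every node that appears as a child in some edge
    for child, parent in edges:
        child_dict = children.setdefault(child, {})
        # Attach the child's own dict under its parent, so later edges
        # that add grandchildren show up in the parent automatically.
        children.setdefault(parent, {})[child] = child_dict
        has_parent.add(child)
    roots = [node for node in children if node not in has_parent]
    if len(roots) != 1:
        raise ValueError("expected exactly one root, found %r" % (roots,))
    root = roots[0]
    return {root: children[root]}


def format_tree(tree, depth=0):
    """Return the tree as a list of lines, one space of indent per level."""
    lines = []
    for name in sorted(tree):
        lines.append(" " * depth + str(name))
        lines.extend(format_tree(tree[name], depth + 1))
    return lines
```

Calling build_tree([[0, 2], [3, 0], [1, 4], [2, 4]]) gives the nested dict shown in the post, and "\n".join(format_tree(...)) gives the indented print. The single pass works because each node's children dict is shared by reference, so the order of the edges does not matter.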
Re: Learn Technical Writing from Unix Man in 10 Days
On Apr 29, 7:43 pm, Jason Earl je...@notengoamigos.org wrote:
> On Sat, Apr 28 2012, Steven D'Aprano wrote:
>
>> On Sat, 28 Apr 2012 14:55:42 -0700, Xah Lee wrote:
>>
>>> Learn Technical Writing from Unix Man in 10 Days
>>>
>>> Quote from man apt-get:
>>>
>>> remove
>>>     remove is identical to install except that packages are
>>>     removed instead of installed.
>>
>> Do you also expect the documentation to define except, instead, is, to and the?
>>
>> If you don't know what install and remove means, then you need an English dictionary, not a technical manual.
>
> It is considerably worse than that. If you look at what the documentation for apt-get actually says, instead of just the badly mangled version that Xah shares you would realize that the post was basically a bald-face troll.
>
> The rest of Xah's links in this particular article was even worse. For the most part he was criticizing documentation flaws that have disappeared years ago. Heck, his criticism of Emacs' missing documentation has been fixed since Emacs 21 (the Emacs developers are currently getting ready to release Emacs 24). His criticism of git's documentation is also grossly misleading. kernel.org still has the empty directories, but git-scm.org has been the official home for git's documentation for years.
>
> I am sure that the rest of the examples are just as ridiculous.
>
> I tend to like Xah's writing. Heck, I even sent a few bucks his way as thanks for his Emacs Lisp tutorials. However, that particular post was simply ridiculous.
>
> Jason

jason, are you trolling me, or me you? ☺

Xah
Learn Technical Writing from Unix Man in 10 Days
Quote from man apt-get:

    remove
        remove is identical to install except that packages are removed instead of installed.

Translation:

    kicking
        kicking is identical to kissing except that receiver is kicked instead of kissed.

further readings:

• 〈The Idiocy of Computer Language Docs〉 http://xahlee.org/comp/idiocy_of_comp_lang.html
• 〈Why Open Source Documentation is of Low Quality〉 http://xahlee.org/UnixResource_dir/writ/gubni_papri.html
• 〈Python Documentation Problems〉 http://xahlee.org/perl-python/python_doc_index.html

DISAPPEARING URL IN DOC

so, i was reading man git. Quote:

    Formatted and hyperlinked version of the latest git documentation can be viewed at http://www.kernel.org/pub/software/scm/git/docs/.

but if you go to that url, it shows a list of over one hundred forty empty dirs.

I guess unix/linux idiots can't be bothered to have correct documentation. Inability to write is one thing, but they are unable to maintain a link or update a doc?

does this ever happen to Apple's docs? If it did, i don't recall ever seeing it in 18 years of using Mac.

more records of careless dead links:

• 〈Hackers: Dead Links and Human Compassion?〉 http://xahlee.org/comp/hacker_dead_links_and_compassion.html
• 〈Why Qi Lisp Fails and Clojure Succeeds〉 http://xahlee.org/UnixResource_dir/writ/qi_lang_marketing.html
• 〈unix: Hunspell Path Pain〉 http://xahlee.org/comp/hunspell_spell_path_pain.html
• 〈Python Doc URL disappearance〉 http://xahlee.org/perl-python/python_doc_url_disappearance.html
• 〈A Record of Frustration in IT Industry; Disappearing FSF URLs〉 http://xahlee.org/emacs/gnu_doc.html
John Carmack glorifying functional programing in 3k words
John Carmack glorifying functional programing in 3k words http://www.altdevblogaday.com/2012/04/26/functional-programming-in-c/ where was he ten years ago? O, and btw, i heard that Common Lispers don't do functional programing, is that right? Fuck Common Lispers. Yeah, fuck them. One bunch of Fuckfaces. (and Fuck Pythoners. Python fucking idiots.) O, don't forget, 〈Programing: What are OOP's Jargons and Complexities (Object Oriented Program as Functional Program)〉 http://xahlee.org/Periodic_dosage_dir/t2/oop.html please you peruse of it. your servant, humbly Xah -- http://mail.python.org/mailman/listinfo/python-list
A Design Pattern Question for Functional Programers
Functional programing is getting press in the mainstream. I ran across this dialogue, where an imperative coder was trying to get into functional programing:

    A: What are the design patterns that help structure functional systems?
    B: Design patterns? Hey everyone, look at the muggle try to get the wand to work!

from: 〈Code Watch: Functional programming's smugness problem〉 (2012-04-16) By Larry O'brien. @ http://www.sdtimes.com/content/article.aspx?ArticleID=36534

hi, my dearly beloved C++ java perl python hackers, design pattern your mom!

further readings:

〈Why Software Suck〉 http://xahlee.org/UnixResource_dir/writ/why_software_suck.html
〈What is a Tech Geeker?〉 http://xahlee.org/UnixResource_dir/writ/tech_geeker.html
〈Book Review: Patterns of Software〉 http://xahlee.org/PageTwo_dir/Personal_dir/bookReviewRichardGabriel.html
〈Are You Intelligent Enough to Understand HTML5?〉 http://xahlee.org/UnixResource_dir/writ/html5_vs_intelligence.html

Xah
Emacs Lisp vs Perl: Validate Local File Links
〈Emacs Lisp vs Perl: Validate Local File Links〉 http://xahlee.org/emacs/elisp_vs_perl_validate_links.html

a comparison of 2 scripts. lots of code, so i won't paste the plain text version here. i have some comments at the bottom.

Excerpt:

--

«One thing interesting is to compare the approaches in perl and emacs lisp.»

«For our case, regex is not powerful enough to deal with the problem by itself, due to the nested nature of html. This is why, in my perl code, i split the file by into segments first, then, use regex to deal with now the non-nested segment. This will break if you have a title=x href=z href=math.htmlmath/a. This cannot be worked around unless you really start to write a real parser.»

«The elisp here is more powerful, not because of any lisp features, but because emacs's buffer datatype. You can think of it as a glorified string datatype, that you can move a cursor back and forth, or use regex to search forward or backward, or save cursor positions (index) and grab parts of text for further analysis.»

--

If you are a perl coder, and disagree, let me know your opinion. (showing working code is very welcome)

My comment about perl there applies to python too. (python code welcome too.)

Xah
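Since python code was invited: a minimal Python 3 sketch of the "real parser" route, using the standard library's html.parser instead of regex to pull out link targets and check that local ones exist on disk. This is not either of the two scripts from the article. The names LinkCollector, local_links and broken_local_links are hypothetical, and the sketch assumes that links with no URL scheme or host are relative file paths.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class LinkCollector(HTMLParser):
    """Collect every href/src attribute value seen in start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def local_links(html_text):
    """Return the file-path part of every link that has no scheme or host."""
    collector = LinkCollector()
    collector.feed(html_text)
    paths = []
    for link in collector.links:
        parsed = urlparse(link)
        if parsed.scheme or parsed.netloc:
            continue  # http:, mailto:, //host/... are not local files
        if not parsed.path:
            continue  # pure fragment such as "#top"
        paths.append(unquote(parsed.path))
    return paths


def broken_local_links(html_file):
    """Return the local link paths in html_file that do not exist on disk."""
    base = os.path.dirname(os.path.abspath(html_file))
    with open(html_file, encoding="utf-8") as f:
        text = f.read()
    return [p for p in local_links(text)
            if not os.path.exists(os.path.join(base, p))]
```

The parser takes care of attribute quoting and ordering, which is the part the excerpt says a split-then-regex approach cannot do reliably.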
f python?
hi guys, sorry am feeling a bit prolific lately. today's show is:

〈Fuck Python〉 http://xahlee.org/comp/fuck_python.html

Fuck Python

By Xah Lee, 2012-04-08

fuck Python. just fucking spent 2 hours and still going.

here's the short story.

so recently i switched to a Windows version of python. Now, the Windows version takes paths using win backslash, instead of cygwin slash. This fucking broke my find/replace scripts that take a dir level as input. Because i was counting slashes.

Ok no problem. My sloppiness. After all, my implementation wasn't portable. So, let's fix it. After a while, discovered there's the 「os.sep」. Ok, replace 「/」 with 「os.sep」, done. Then, bang, all hell broke loose. Because the backslash is used as escape in strings, any regex that manipulates paths got fucked majorly. So, now you need to find a quoting mechanism. Then, fuck python doc incomprehensible scattered comp-sci-r-us BNF shit. Then, fuck python for the “os.path” and “os” modules, then string object and string functions inconsistent ball. And FUCK Guido who wants to fuck change python for his idiotic OOP concept of “elegance” so that some of these are deprecated.

So after several explorations of “repr()”, “format()”, “‹str›.count()”, “os.path.normpath()”, “re.split()”, “len(re.search().group())” etc, after a long time, let's use “re.escape()”. 2 hours have passed. Also, discovered that “os.path.walk” is now deprecated, and one is supposed to use the sparkling “os.walk”. In the process of refreshing my python, the “os.path.walk” semantics is really one fucked up fuck. Meanwhile, the “os.walk” went into incomprehensible OOP object and iterators fuck.

now, it's close to 3 hours. This fix is supposed to be done in 10 min. I'd have done it in elisp in just 10 minutes if not for my waywardness.
This is Before

    def process_file(dummy, current_dir, file_list):
        current_dir_level = len(re.split("/", current_dir)) - len(re.split("/", input_dir))
        cur_file_level = current_dir_level+1
        if min_level <= cur_file_level <= max_level:
            for a_file in file_list:
                if re.search(r"\.html$", a_file, re.U) and os.path.isfile(current_dir + "/" + a_file):
                    replace_string_in_file(current_dir + "/" + a_file)

This is After

    def process_file(dummy, current_dir, file_list):
        current_dir = os.path.normpath(current_dir)
        cur_dir_level = re.sub( "^" + re.escape(input_dir), "", current_dir).count( os.sep)
        cur_file_level = cur_dir_level + 1
        if min_level <= cur_file_level <= max_level:
            for a_file in file_list:
                if re.search(r"\.html$", a_file, re.U) and os.path.isfile(current_dir + re.escape(os.sep) + a_file):
                    replace_string_in_file(current_dir + os.sep + a_file)
                    # print "%d %s" % (cur_file_level, (current_dir + os.sep + a_file))

Complete File

    # -*- coding: utf-8 -*-
    # Python

    # find replace strings in a dir

    import os, sys, shutil, re

    # if this is not empty, then only these files will be processed
    my_files = []

    input_dir = "c:/Users/h3/web/xahlee_org/lojban/hrefgram2/"
    input_dir = "/cygdrive/c/Users/h3/web/zz"
    input_dir = "c:/Users/h3/web/xahlee_org/"

    min_level = 2; # files and dirs inside input_dir are level 1.
    max_level = 2; # inclusive

    print_no_change = False

    find_replace_list = [
    (
    u"""<iframe style="width:100%;border:none" src="http://xahlee.org/footer.html"></iframe>""",
    u"""<iframe style="width:100%;border:none" src="../footer.html"></iframe>""",
    ),
    ]

    def replace_string_in_file(file_path):
        "Replaces all findStr by repStr in file file_path"
        temp_fname = file_path + "~lc~"
        backup_fname = file_path + "~bk~"

        # print "reading:", file_path
        input_file = open(file_path, "rb")
        file_content = unicode(input_file.read(), "utf-8")
        input_file.close()

        num_replaced = 0
        for a_pair in find_replace_list:
            num_replaced += file_content.count(a_pair[0])
            output_text = file_content.replace(a_pair[0], a_pair[1])
            file_content = output_text

        if num_replaced > 0:
            print "◆ ", num_replaced, " ", file_path.replace("\\", "/")
            shutil.copy2(file_path, backup_fname)
            output_file = open(file_path, "r+b")
            output_file.read() # we do this way instead of “os.rename” to preserve file creation date
            output_file.seek(0)
            output_file.write(output_text.encode("utf-8"))
            output_file.truncate()
            output_file.close()
        else:
            if print_no_change == True:
                print "no change:", file_path

        # os.remove(file_path)
        # os.rename(temp_fname, file_path)

    def process_file(dummy, current_dir, file_list):
        current_dir = os.path.normpath(current_dir)
        cur_dir_level = re.sub( "^" + re.escape(input_dir), "", current_dir).count( os.sep)
        cur_file_level = cur_dir_level + 1
        if min_level <= cur_file_level <= max_level:
            for a_file in file_list:
                if re.search(r"\.html$", a_file, re.U) and os.path.isfile(current_dir + re.escape(os.sep) + a_file):
                    replace_string_in_file(current_dir + os.sep + a_file)
                    # print "%d %s
Re: f python?
On Apr 8, 4:34 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote:
> On Sun, 08 Apr 2012 04:11:20 -0700, Xah Lee wrote:
>
> [...]
>
> I have read Xah Lee's post so that you don't have to.
>
> Shorter Xah Lee: I don't know Python very well, and rather than admit I made some pretty lousy design choices in my code, I blame Python. And then I cross-post about it, because I'm the most important person in the Universe.
>
> When the only tool you know how to use is a hammer, everything looks like a nail. Instead of using regexes (now you have two problems), use the right tool: to count path components, split the path, then count the number of path components directly.
>
>     import os
>     components = os.path.split(some_path)
>     print len(components)
>
> No matter what separator the OS uses, os.path.split will do the right thing. There's no need to mess about escaping separators so you can hammer it with the regex module, when Python comes with the perfectly functional socket-wrench you actually need.

Lol. i think you tried to make fun of me too fast. check your code.

O, was it you who made fun of my python tutorial before? i was busy, i'll have to get back on that down the road.

Xah
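On "check your code": os.path.split returns a (head, tail) pair, so len() of its result is always 2, whatever the path. A minimal sketch of counting path components without regex, in Python 3 with pathlib (which did not exist in the Python 2 of this thread), might look like the following. The helper names count_components and relative_depth are made up for illustration, and the flavour parameter is only there so both separator styles can be shown on any OS.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath


def count_components(path, flavour=PurePosixPath):
    """Count the named components of a path, ignoring the root or drive."""
    p = flavour(path)
    # p.parts includes the anchor ("/" or "c:\\") as its first element
    # for absolute paths; leave it out so only names are counted.
    return len([part for part in p.parts if part != p.anchor])


def relative_depth(path, base, flavour=PurePosixPath):
    """How many directory levels path lies below base."""
    return len(flavour(path).relative_to(flavour(base)).parts)


# os.path.split only separates the last component from the rest:
pair = os.path.split("a/b/c")   # ("a/b", "c")
```

PureWindowsPath accepts both slash and backslash as separators, so relative_depth("c:/Users/h3/web/xahlee_org/emacs", "c:/Users/h3/web/xahlee_org", PureWindowsPath) comes out the same however the path was typed. That is the directory-level number the original script was after.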
Re: f python?
Xah Lee wrote: « http://xahlee.org/comp/fuck_python.html » David Canzi wrote «When Microsoft created MS-DOS, they decided to use '\' as the separator in file names. This was at a time when several previously existing interactive operating systems were using '/' as the file name separator and at least one was using '\' as an escape character. As a result of Microsoft's decision to use '\' as the separator, people have had to do extra work to adapt programs written for Windows to run in non-Windows environments, and vice versa. People have had to do extra work to write software that is portable between these environments. People have done extra work while creating tools to make writing portable software easier. And people have to do extra work when they use these tools, because using them is still harder than writing portable code for operating systems that all used '/' as their separator would have been.» namekuseijin wrote: yes, absolutely. But you got 2 inaccuracies there: 1) Microsoft didn't create DOS; 2) fucking DOS was written in C, and guess what, it uses \ as escape character. Fucking microsoft. So, when you say fuck Python, are you sure you're shooting at the right target? I agree. Fuck winDOS and fucking microsoft. No. The choice to use backslash than slash is actually a good one. because, slash is one of the useful char, far more so than backslash. Users should be able to use that for file names. i don't know the detailed history of path separator, but if i were to blame, it's fuck unix. The entirety of unix, unix geek, unixers, unix fuckheads. Fuck unix. 
〈On Unix Filename Characters Problem〉 http://xahlee.org/UnixResource_dir/writ/unix_filename_chars.html 〈On Unix File System's Case Sensitivity〉 http://xahlee.org/UnixResource_dir/_/fileCaseSens.html 〈UNIX Tar Problem: File Length Truncation, Unicode Name Support〉 http://xahlee.org/comp/unix_tar_problem.html 〈What Characters Are Not Allowed in File Names?〉 http://xahlee.org/mswin/allowed_chars_in_file_names.html 〈Unicode Support in File Names: Windows, Mac, Emacs, Unison, Rsync, USB, Zip〉 http://xahlee.org/mswin/unicode_support_file_names.html 〈The Nature of the Unix Philosophy〉 http://xahlee.org/UnixResource_dir/writ/unix_phil.html Xah -- http://mail.python.org/mailman/listinfo/python-list
how i loved lisp cons and UML and Agile and Design Patterns and Pythonic and KISS and YMMV and stopped worrying
OMG, how i loved lisp cons and macros and UML and Agile eXtreme Programing and Design Patterns and Anti-Patterns and Pythonic and KISS and YMMV and stopped worrying. 〈World Multiconference on Systemics, Cybernetics and Informatics???〉 http://xahlee.org/comp/WMSCI.html highly advanced plain text format follows, as a amenity for tech geekers. --- World Multiconference on Systemics, Cybernetics and Informatics ??? Xah Lee, 2010-04-04 Starting in 2004, i regularly receive email asking me to participate a conference, called “World Multiconference on Systemics, Cybernetics and Informatics” (WMSCI). Here's one of such email i got today: Dear Xah Lee: As you know the Nobel Laureate Herbert Simon affirmed that design is an essential ingredient of the Artificial Sciences Ranulph Glanville, president of the American Society for Cybernetics and expert in design theory, affirms that “Research is a variety of design. So do research as design. Design is key to research. Research has to be designed.” An increasing number of authors are stressing the relationships between Design and Research. Design is a mean for Research, and Research is a mean for Design. Design and research are related via cybernetic loops in the context of means-ends logic. Consequently, we invite you to submit a paper/abstract and/ot to organize an invited session in the International Symposium on Design and Research in the Artificial and the Natural Sciences: DRANS 2010 (http://www.sysconfer.org/drans) which is being organized in the context of The 14th World Multi- Conference on Systemics, Cybernetics and Informatics: WMSCI 2010 (http://www.sysconfer.org/wmsci), 2010 in Orlando, Florida, USA. … Here's the first email i got from them from my mail archive: From: sci2...@iiis.org Subject: Inviting you to participate in SCI 2005 Date: October 20, 2004 1:39:48 PM PDT To: x...@xahlee.org Dear Dr. 
Xah Lee: On behalf of the SCI 2005 Organizing Committee, I would like to invite you to participate in the 9th World Multi-Conference on Systemics, Cybernetics and Informatics (http://www.iiisci.org/sci2005), which will take place in Orlando, Florida, USA, from July 10-13, 2005. Full text wmsci.txt. I do not know this organization. I don't know how they got my email or how they know that i'm involved in the computer science community. (surely from trawling email addresses in science forums) Though, after getting a few of their emails, one clearly gets a sense that it is a scam, soliciting innocent idiotic academicians (many PhDs are idiots.). Here's what Wikipedia has to say about them: World Multiconference on Systemics, Cybernetics and Informatics. Here's a juicy quote: WMSCI attracted publicity of a less favorable sort in 2005 when three graduate students at MIT succeeded in getting a paper accepted as a “non-reviewed paper” to the conference that had been randomly generated by a computer program called SCIgen.[8] Documents generated by this software have been used to submit papers to other similar conferences. Compare to the Sokal affair. WMSCI has been accused of using spam to advertise its conferences.[8] Now and then, whenever i got their email, the curiosity in me do lookup the several terms they used in the email, partly to check the validity. For example, in this one, it mentions Herbert Simon. Another one i recall i got recently mentioned Science 2.0. Both of the terms i haven't heard of before. One'd think that it is easy to tell scam from real science, but with today's science proliferation, it's actually not that easy. Even if you are a academic, it's rather common that many new science terms you never heard of, because there are tremendous growth of new disciplines or cross disciplines, along with new jargons. 
Cross-discipline is rather common and natural, unlike in the past where science is more or less clearly delineated hierarchy like Physics, Math, Chemistry, Biology, etc and their sub-branches. However, many of today's new areas is a bit questionable, sometimes a deliberate money-making scheme, which i suppose is the case for WMSCI. Many of these, use terms like “post-modern”, “science 2.0” to excuse themselves from the rather strict judgment of classic science. Many of these terms such as “systemics”, “cybernetics”, “infomatics” are vague. Depending on the context, it could be a valid emerging science discipline, but it could also be pure new-age hogwash. And sometimes, nobody really knows today. Fledgling scientific fields may started off as pseudo-science but later became well accepted with more solid theories. (e.g. evolutionary psychology) In the past 2 decades, there are quite a few cases where peer reviewed papers published in respected journals are exposed as highly questionable or deliberate hoax, arose massive debate on the peer review system. The peer-review system itself can't hold all the blame, but part of it has to do with the incredible growth of sciences
Re: Is Programing Art or Science?
On Apr 3, 8:22 am, Rainer Weikusat rweiku...@mssgmbh.com wrote: Xah Lee xah...@gmail.com writes: [...] For example, “Is mathematics science or art?”, is the same type of question that has been broached by dabblers now and then. http://en.wikipedia.org/wiki/Liberal_arts this is the best reply in this thread! Xah -- http://mail.python.org/mailman/listinfo/python-list
Is Programing Art or Science?
the refreshen of the blood, from Xah's Entertainment Enterprise, i bring you: 〈Is Programing Art or Science〉 http://xahlee.org/UnixResource_dir/writ/art_or_science.html penned in the year of our lord two thousand and two, plain text version follows. Is Programing Art or Science? Dear friends, You mentioned the title of Donald Knuth's magnum opus Art of Programming in the context of discussion that fringes on whether programing is science or art. I'm quite pissed off at work at the moment, so let me take the time to give some guide on this matter to the daily programers. At the bottom rung of programers, there's no question about whether programing is science or art. Because monkey coders could not care less. These folks ain't be reading this post, for they hardly will have heard of lisp. This leaves us with elite programers who have a smattering of interests on cogitation and philosophical conundrums. So, is programing a science or art? For the programing morons, this question is associated with erudition. It certainly is a hip subject among hackers such as those hardcore Perl advocates and unix proponents, who would casually hint on such realization, impressing a sophistication among peers. Such a question is not uncommon among those curious. For example, “Is mathematics science or art?”, is the same type of question that has been broached by dabblers now and then. We can also detect such dilemma in the titles conferred to blathering computer jockeys: which one are thee: baccalaureate of science or baccalaureate of arts? It really makes no fucking difference. Ultimately, fantastically stupid questions like these are not discussed by mathematicians nor philosophers. These are natural language side-effects, trapping dummies to fuzz about nothing; not unlike quotations. 
For these computing jockeys, there remains the question of why Knuth named his books the “Art” of Computer Programing, or why some computing luminaries litter the caution that programing is as much a art as science. What elite dimwits need to realize is that these authors are not defining or correcting, but breaking precepts among the automatons in programing industry. To the readers of hip literature, words like science and art are spellbinding, and the need to pigeonhole is imminent. Of these ruminating class of people, the problem lies in their wanting of originality. What fills their banal brain are the stale food of thought that has been chewed and spewed. These above-average eggheads mop up the scholastic tidbits of its day to mull and muse with fellow eggheads. They could not see new perspectives. Could not understand gists. Could not detect non-questions. They are the holder and passer of knowledge, a bucket of pre-digested purees. Their train of thought forever loops around established tracks — going nowhere, anytime! So, is programing a art or science? is it art or science? I really need to know. • Theory vs Practice • Jargons of IT Industry • The Lambda Logo Tour • English Lawers PS don't forget to checkout: 〈From Why Not Ruby to Fuck Python, Hello Ruby〉 @ http://xahlee.org/UnixResource_dir/writ/why_not_Ruby.html yours humbly, Xah -- http://mail.python.org/mailman/listinfo/python-list
Google Tech Talk: lisp at JPL
Dearly beloved lisperati, I present you, Ron Garret (aka Erann Gat — aka Naggum hater and enemy of Kenny Tilton), at Google Tech Talk 〈The Remote Agent Experiment: Debugging Code from 60 Million Miles Away〉 Google Tech Talk, (2012-02-14) Presented by Ron Garret. @ http://www.youtube.com/watch?v=_gZK0tW8EhQ i just started watching, havn't done yet. (thx jcs's blog for the news) PS posted to python and perl forums too, because i think might be interesting for lang aficionados . Reply to just your community please. Xah -- http://mail.python.org/mailman/listinfo/python-list
perldoc: the key to perl
〈Perl Documentation: The Key to Perl〉 http://xahlee.org/perl-python/key_to_perl.html

plain text follows
-

So, i wanted to know what the option perl -C does. So, here's perldoc perlrun. Excerpt:

    -C [*number/list*]
        The -C flag controls some of the Perl Unicode features.

        As of 5.8.1, the -C can be followed either by a number or a list of
        option letters. The letters, their numeric values, and effects are
        as follows; listing the letters is equal to summing the numbers.

            I     1   STDIN is assumed to be in UTF-8
            O     2   STDOUT will be in UTF-8
            E     4   STDERR will be in UTF-8
            S     7   I + O + E
            i     8   UTF-8 is the default PerlIO layer for input streams
            o    16   UTF-8 is the default PerlIO layer for output streams
            D    24   i + o
            A    32   the @ARGV elements are expected to be strings encoded
                      in UTF-8
            L    64   normally the IOEioA are unconditional,
                      the L makes them conditional on the locale environment
                      variables (the LC_ALL, LC_TYPE, and LANG, in the order
                      of decreasing precedence) -- if the variables indicate
                      UTF-8, then the selected IOEioA are in effect
            a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching code in
                      debugging mode.

        For example, -COE and -C6 will both turn on UTF-8-ness on both
        STDOUT and STDERR. Repeating letters is just redundant, not
        cumulative nor toggling.

        The io options mean that any subsequent open() (or similar I/O
        operations) in the current file scope will have the :utf8 PerlIO
        layer implicitly applied to them, in other words, UTF-8 is expected
        from any input stream, and UTF-8 is produced to any output stream.
        This is just the default, with explicit layers in open() and with
        binmode() one can manipulate streams as usual.

        -C on its own (not followed by any number or option list), or the
        empty string for the PERL_UNICODE environment variable, has the
        same effect as -CSDL. In other words, the standard I/O handles and
        the default open() layer are UTF-8-fied *but* only if the locale
        environment variables indicate a UTF-8 locale. This behaviour
        follows the *implicit* (and problematic) UTF-8 behaviour of Perl
        5.8.0.

        You can use -C0 (or 0 for PERL_UNICODE) to explicitly disable all
        the above Unicode features.

        The read-only magic variable ${^UNICODE} reflects the numeric value
        of this setting. This variable is set during Perl startup and is
        thereafter read-only. If you want runtime effects, use the
        three-arg open() (see open in perlfunc), the two-arg binmode() (see
        binmode in perlfunc), and the open pragma (see open).

        (In Perls earlier than 5.8.1 the -C switch was a Win32-only switch
        that enabled the use of Unicode-aware wide system call Win32
        APIs. This feature was practically unused, however, and the command
        line switch was therefore recycled.)

        Note: Since perl 5.10.1, if the -C option is used on the #! line,
        it must be specified on the command line as well, since the
        standard streams are already set up at this point in the execution
        of the perl interpreter. You can also use binmode() to set the
        encoding of an I/O stream.

reading that is like an adventure. It's like this:

    The -C is a key to unlock many secrets. Just get it, and you'll be all good to go, except in cases you may need the inner key. You'll find a hinge in the key, open it, then there's a subkey. On the subkey, there's a number. Take that number to the lock, it will open with keyX. When you use keyX, it must be matched with the previous inner key with 8th bit. keyX doesn't have a ID, but you can make one by finding the number at the place you found the key C. Key C is actually optional, but when inner key and keyX's number matches, it changes the nature of the lock. This is when you need to turn on keyMode …

Xah
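For what it's worth, the letter/number part of the excerpt is just a bit mask. A small Python 3 sketch, using only the values listed in the perldoc excerpt above, shows how "listing the letters is equal to summing the numbers" and why repeated letters change nothing. The names PERL_C_FLAGS and letters_to_number are hypothetical and are not part of perl or of any library.

```python
# Letter -> numeric value, copied from the perlrun excerpt above.
PERL_C_FLAGS = {
    "I": 1, "O": 2, "E": 4, "S": 7,
    "i": 8, "o": 16, "D": 24,
    "A": 32, "L": 64, "a": 256,
}


def letters_to_number(letters):
    """Combine -C option letters into the equivalent numeric argument."""
    total = 0
    for letter in letters:
        # Bitwise OR rather than +, so a repeated or overlapping letter
        # (e.g. "SO", where S already includes O) is redundant, not cumulative.
        total |= PERL_C_FLAGS[letter]
    return total
```

With this, letters_to_number("OE") is 6, matching the excerpt's statement that -COE and -C6 are the same, and letters_to_number("SDL") gives the number that corresponds to the -CSDL default.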
a interesting Parallel Programing Problem: asciify-string
here's an interesting problem that we are discussing at comp.lang.lisp.

〈Parallel Programing Problem: asciify-string〉 http://xahlee.org/comp/parallel_programing_exercise_asciify-string.html

here's the plain text. Code example is emacs lisp, but the problem is general.

for a bit of python relevancy… is there any python compiler that's parallel-algorithm aware?

---

Parallel Programing Problem: asciify-string

Here's an interesting parallel programing problem.

Problem Description

The task is to change this function so it's parallelizable. (code example in emacs lisp)

    (defun asciify-string (inputStr)
      "Make Unicode string into equivalent ASCII ones."
      (setq inputStr (replace-regexp-in-string "á\\|à\\|â\\|ä" "a" inputStr))
      (setq inputStr (replace-regexp-in-string "é\\|è\\|ê\\|ë" "e" inputStr))
      (setq inputStr (replace-regexp-in-string "í\\|ì\\|î\\|ï" "i" inputStr))
      (setq inputStr (replace-regexp-in-string "ó\\|ò\\|ô\\|ö" "o" inputStr))
      (setq inputStr (replace-regexp-in-string "ú\\|ù\\|û\\|ü" "u" inputStr))
      inputStr
      )

Here's a more general description of the problem.

You are given a Unicode text file that's a few petabytes. Certain characters in the file need to be changed to a different char. (For an example of practical application, see: IDN homograph attack ◇ Duplicate characters in Unicode.)

One easy solution is to simply use regex, as in the above sample code, to search thru the file sequentially and perform the transform of a particular set of chars, then repeat for each char that needs to be changed.

But your task is to use a parallelizable algorithm. That is, in a parallel-algorithm aware language (e.g. Fortress), the compiler will automatically spread the computation over multiple processors.

Refer to Guy Steele's video talk if you haven't seen it already. See: Guy Steele on Parallel Programing.

Solution Suggestions

A better way to write it for parallel programing is to map a char-transform function to each char in the string. Here's a pseudo-code in lisp by Helmut Eller:

    (defun asciify-char (c)
      (case c
        ((?á ?à ?â ?ä) ?a)
        ((?é ?è ?ê ?ë) ?e)
        ((?í ?ì ?î ?ï) ?i)
        ((?ó ?ò ?ô ?ö) ?o)
        ((?ú ?ù ?û ?ü) ?u)
        (t c)))

    (defun asciify-string (string)
      (map 'string #'asciify-char string))

One problem with this is that the function “asciify-char” itself is sequential, and not 100% parallelizable. (we might assume here that there are billions of chars in Unicode that need to be transformed)

It would be an interesting small project if someone actually used a parallel-algorithm-aware language to work on this problem, and reported on the break-point of file size for parallel-algorithm vs sequential-algorithm.

Anyone want to try it? Perhaps in Fortress, Erlang, Ease, Alice, X10, or other? Is Clojure parallel aware?

Xah
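To give the python side of the question something concrete: a rough Python 3 sketch of the same map-a-char-function idea, restructured as "split into chunks, transform each chunk independently, join". This is only an illustration, not a benchmark and not anything from the comp.lang.lisp thread. The names asciify_chunk, asciify_string and asciify_threaded are made up. It also assumes CPython, where threads will not give real CPU parallelism for this workload; a process pool or a genuinely parallel runtime would be needed for that.

```python
from concurrent.futures import ThreadPoolExecutor

# Same character groups as the elisp example above.
_GROUPS = {"a": "áàâä", "e": "éèêë", "i": "íìîï", "o": "óòôö", "u": "úùûü"}

# One lookup table, so the per-character step is a single table lookup
# instead of five sequential regex passes over the whole string.
ASCIIFY_TABLE = str.maketrans(
    {accented: plain for plain, chars in _GROUPS.items() for accented in chars}
)


def asciify_chunk(chunk):
    """Transform one chunk; depends on nothing outside the chunk."""
    return chunk.translate(ASCIIFY_TABLE)


def asciify_string(text, chunk_size=1 << 16, map_fn=map):
    """Split text into chunks, map asciify_chunk over them, join the results.

    map_fn can be the builtin map (sequential) or any order-preserving
    parallel map, such as the .map method of an executor.
    """
    chunks = (text[i:i + chunk_size] for i in range(0, len(text), chunk_size))
    return "".join(map_fn(asciify_chunk, chunks))


def asciify_threaded(text, chunk_size=1 << 16, workers=4):
    """Same computation, with the chunks handed to a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return asciify_string(text, chunk_size, pool.map)
```

The point of the shape is that each chunk is independent, so the mapping step can be handed to whatever parallel map is available without touching asciify_chunk. Where the break-even file size lies is exactly the open question in the post.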
are int, float, long, double, side-effects of computer engineering?
some additional info i thought is relevant.

are int, float, long, double, side-effects of computer engineering?

Xah Lee wrote:
«… One easy way to measure it is whether a programer can read and understand a program without having to delve into its idiosyncrasies. …»

Chris Angelico wrote:
«Neither the behavior of ints nor the behavior of IEEE floating point is a quirk or an idiosyncracy. …»

they are computer engineering by-products. They are quirks and idiosyncrasies. Check out an advanced lang such as Mathematica. There, one can learn how the mathematical concept of integer or real number is implemented in a computer language, without lots of by-products of comp engineering as in the vast majority of langs (all those that chalk it up to some IEEE. (which, sadly, includes C, C++, perl, python, lisp, and almost all. (Common/Scheme lisp idiots speak of the jargon “number tower” instead of IEEE.) (part of the reason almost all langs stick to some IEEE stuff is because it's kinda standard, and everyone understands it, in the sense that unix RFC (aka really fucking common) is wide-spread because it's free yet technically worst. (in a sense, when everybody's stupid, there arises a cost to not be stupid..

A friend asked:
«Can you enlighten us as to Mathematica's way of handling numbers, either by a post or a link to suitable documentation? …»

what i meant to point out is that Mathematica deals with numbers in a high-level human way. That is, one doesn't think in terms of float, long, int, double. These words are never mentioned. Instead, you have concepts of machine precision and accuracy. The lang automatically handles the translation to hardware, invoking exact values or infinite precision as required or requested.

in most langs' docs, words like int, long, double, float are part of the lang, and they are quick to mention IEEE. Then you have the wide-spread overflow issue in your lang. In M, the programer only needs to think in terms of math, i.e. Real number, Integer, complex number, precision, accuracy, etc.

this is what i meant by saying that most langs deal with computer engineering by-products, and i wish they were higher level like M.

http://reference.wolfram.com/mathematica/guide/PrecisionAndAccuracyControl.html

Xah
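For Python readers, a small sketch of the distinction being drawn between machine numbers and exact numbers. This is not Mathematica's model; it only uses Python's built-in int and float plus the standard library's fractions module, and the variable names are made up for illustration.

```python
from fractions import Fraction

# Python's int is arbitrary precision, so there is no fixed-width overflow.
big = 2 ** 100

# float is an IEEE 754 double, so most decimal fractions are only approximated.
float_sum_is_exact = (0.1 + 0.2 == 0.3)

# Fraction keeps exact rational values, at some cost in speed.
exact_sum_is_exact = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

print(big)
print(float_sum_is_exact)   # False
print(exact_sum_is_exact)   # True
```

The programmer still has to pick Fraction explicitly here, which is the kind of choice the post argues a higher-level language should make on its own.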
Re: are int, float, long, double, side-effects of computer engineering?
On Mar 5, 9:26 pm, Tim Roberts t...@probo.com wrote:
> Xah Lee xah...@gmail.com wrote:
> > some additional info i thought is relevant.
> > are int, float, long, double, side-effects of computer engineering?
>
> Of course they are. Such concepts violate the purity of a computer language's abstraction of the underlying hardware. We accept that violation because of performance reasons. There are, as you point out, languages that do maintain the purity of the abstraction, but that purity is ALWAYS at the expense of performance.
>
> I would also point out pre-emptively that there is nothing inherently wrong with asking us to accept an impure abstraction in exchange for performance. It is a performance choice that we choose to make.

what you said is true, but the problem is that 99.99% of programers do NOT know this. They do not know Mathematica. They've never seen a language with such a feature. The concept is alien. This is what i'd like to point out and spread awareness of.

also, the argument about raw speed and fine control vs automatic management rots with time. It happened with auto memory management, managed code, compilers, auto type conversion, auto extension of arrays, auto type systems, dynamic/scripting languages, etc.

i'd share these talks:

〈Programing Language: Steve Yegge on Dynamic Languages〉
http://xahlee.org/comp/Steve_Yegge_on_dynamic_languages.html

〈Guy Steele on Parallel Programing: Get rid of cons!〉
http://xahlee.org/comp/Guy_Steele_parallel_computing.html

〈Ocaml Use in Industry (Janestreet Talk by Yaron Minsky)〉
http://xahlee.org/comp/Yaron_Minsky_Janestreet_talk.html

〈Stephen Wolfram: The Background and Vision of Mathematica〉
http://xahlee.blogspot.com/2011/10/stephen-wolfram-background-and-vision.html

Xah
Re: lang comparison: in-place algorithm for reversing a list in Perl, Python, Lisp
Xah Lee wrote:
«… One easy way to measure it is whether a programer can read and understand a program without having to delve into its idiosyncrasies. …»

Chris Angelico wrote:
«Neither the behavior of ints nor the behavior of IEEE floating point is a quirk or an idiosyncracy. …»

they are computer engineering by-products. They are quirks and idiosyncrasies. Check out an advanced lang such as Mathematica. There, one can learn how the mathematical concept of integer or real number is implemented in a computer language, without lots of by-products of comp engineering as in the vast majority of langs (all those that chalk it up to some IEEE. (which, sadly, includes C, C++, java, perl, python, lisp, and almost all. (lisp idiots speak of the jargon “number tower” instead of IEEE.) (part of the reason almost all langs stick to some IEEE stuff is because it's kinda standard, and everyone understands it, in the sense that unix RFC (aka really fucking common) is wide-spread because it's free yet technically worst. (in a sense, when everybody's stupid, there arises a cost to not be stupid..

Xah
Re: New Science Discovery: Perl Idiots Remain Idiots After A Decade!
On Mar 1, 3:00 am, Kiuhnm kiuhnm03.4t.yahoo.it wrote:
> They did not make up the terminology, if that is what you are saying. The concepts of left and right associativity are well-known and accepted in TCS (Theoretical CS).
>
> Aho, Sethi and Ullman explain it this way in Compilers: Principles, Techniques and Tools: We say that the operator + associates to the left because an operand with plus signs on both sides of it is taken by the operator to its left. [...] And they also show parse trees similar to the ones I wrote above.

how do they explain when 2 operators are adjacent, e.g. 「3 △ 6 ▲ 5」?

do you happen to know some site that shows the relevant page, so i can have a look?

thanks.

Xah

On Mar 1, 3:00 am, Kiuhnm kiuhnm03.4t.yahoo.it wrote:
> On 3/1/2012 1:02, Xah Lee wrote:
> > i missed a point in my original post. That is, when the same operator are adjacent. e.g. 「3 ▲ 6 ▲ 5」.
> >
> > This is pointed out by Kiuhnm 〔kiuhnm03.4t.yahoo.it〕 and Tim Bradshaw. Thanks.
> >
> > though, i disagree the way they expressed it, or any sense this is different from math.
>
> They did not make up the terminology, if that is what you are saying. The concepts of left and right associativity are well-known and accepted in TCS (Theoretical CS).
>
> If you change the terminology, no one will understand you unless you provide your definitions every time (and then they may not accept them).
>
> Another way of saying that an operator is left-associative is that its parse tree is a left-tree, i.e. a complete tree where each right child is a leaf.
>
> For instance, (use a monospaced font) 1 + 2 + 3 + 4 gives you this left-tree:
>
>           +
>         +   4
>       +   3
>     1   2
>
> while 1**2**3**4 gives you this right-tree:
>
>       **
>     1    **
>        2    **
>           3    4
>
> Aho, Sethi and Ullman explain it this way in Compilers: Principles, Techniques and Tools: We say that the operator + associates to the left because an operand with plus signs on both sides of it is taken by the operator to its left. [...] And they also show parse trees similar to the ones I wrote above.
>
> Kiuhnm
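A quick Python check of the two groupings described in the quoted parse trees. Only built-in operators are used; the numbers are picked for illustration so that the two possible groupings give different results.

```python
# Subtraction groups to the left: a - b - c means (a - b) - c.
left_grouped = (10 - 4) - 3      # 3
right_grouped = 10 - (4 - 3)     # 9
assert 10 - 4 - 3 == left_grouped
assert 10 - 4 - 3 != right_grouped

# Exponentiation groups to the right: a ** b ** c means a ** (b ** c).
assert 2 ** 3 ** 2 == 2 ** (3 ** 2) == 512
assert (2 ** 3) ** 2 == 64

print(10 - 4 - 3, 2 ** 3 ** 2)
```

Addition would not show the difference, because (a + b) + c and a + (b + c) agree for integers; that is the mathematical sense of "associative", as opposed to the parsing sense used in the quote.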
Re: lang comparison: in-place algorithm for reversing a list in Perl, Python, Lisp
On Mar 1, 7:04 am, Kaz Kylheku k...@kylheku.com wrote:
> lisp:
> (floor (/ x y)) --[rewrite]--> (floor x y)

Thanks for this interesting point.

I don't think it's good lang design, more of a lang quirk.

similarly, in Python 2.x, x/y will work when both x and y are integers. Also, x//y works too, but that // is just a perlish unreadable syntax quirk.

similarly, in perl, either one of:

    require POSIX; floor(x/y);

where the require POSIX instead of Math is a quirk. And even so, floor should really be builtin.

or using a perl hack:

    int(x/y)

all of the above are quirks. They rely on computer engineering by-products (such as int), or rely on the lang's idiosyncrasy. One easy way to measure it is whether a programer can read and understand a program without having to delve into its idiosyncrasies. The problem with these lang idioms is that they are harder to understand, and whatever advantage/optimization they provide is microscopic and temporary.

best is really floor(x/y).

idiomatic programing is a bad thing. It was spread by perl, of course, in the 1990s. An idiomatic lang, i.e. a lang with a huge number of bizarre idioms, such as perl, is the worst.

Xah
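The spellings discussed above are not all equivalent, which a short sketch makes visible. This assumes Python 3, where / is true division; the function name divisions is made up for illustration. Python's int() truncates toward zero, as Perl's int does, so it only agrees with floor for non-negative quotients.

```python
import math


def divisions(x, y):
    """Return (floor division, floor of true division, truncation) for x / y."""
    return (x // y, math.floor(x / y), int(x / y))


print(divisions(7, 2))    # (3, 3, 3)
print(divisions(-7, 2))   # (-4, -4, -3)
```

For very large integers, x // y stays exact while math.floor(x / y) goes through a float first, so the two can differ there as well.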
New Science Discovery: Perl Idiots Remain Idiots After A Decade!
New Science Discovery: Perl Idiots Remain Idiots After A Decade!

An excerpt from the new book 〈Modern Perl〉, just published, chapter 4 on “Operators”. Quote:

«The associativity of an operator governs whether it evaluates from left to right or right to left. Addition is left associative, such that 2 + 3 + 4 evaluates 2 + 3 first, then adds 4 to the result. Exponentiation is right associative, such that 2 ** 3 ** 4 evaluates 3 ** 4 first, then raises 2 to the 81st power. »

LOL. Looks like the perl folks haven't changed. Fundamentals of serious math got botched so badly.

Let me explain the idiocy.

It says “The associativity of an operator governs whether it evaluates from left to right or right to left.”. Ok, so let's say we have 2 operators: a white triangle △ and a black triangle ▲. Now, by the perl's teaching above, let's suppose the white triangle is “right associative” and the black triangle is “left associative”. Now, look at this:

    3 △ 6 ▲ 5

seems like the white and black triangles are going to draw a pistol and fight for the chick 6 there. LOL.

Now, let me tell you what operator precedence is. First of all, let's limit ourselves to discussing operators that are so-called binary operators, which, in our context, basically means a single-symbol operator that takes its left and right side as operands. Now, each symbol has a “precedence”, or in other words, the set of operators has an order. (one easy way to think of this is that, suppose you have n symbols, then you give each a number, from 1 to n, as their order) So, when 2 symbols are placed side by side such as 「3 △ 6 ▲ 5」, the symbol with higher precedence wins. Another easy way to think of this is that each operator has a stickiness level. The higher its level, the more sticky it is.

the problem with the perl explanation is that it's one misleading confusion ball. It isn't about “left/right associativity”. It isn't about “evaluates from left to right or right to left”. Worse, the word “associativity” is a math term that describes a property of algebra that has nothing to do with operator precedence, yet is easily confused with it because it is a property about order of evaluation. (for example, the addition function is associative, meaning: 「(3+6)+5 = 3+(6+5)」.)

compare it with this:

〈Perl & Python: Complex Numbers〉
http://xahlee.org/perl-python/complex_numbers.html

and for a good understanding of functions and operators, see:

〈What's Function, What's Operator?〉
http://xahlee.org/math/function_and_operators.html
Re: New Science Discovery: Perl Idiots Remain Idiots After A Decade!
i missed a point in my original post. That is, when the same operators are adjacent, e.g. 「3 ▲ 6 ▲ 5」.

This is pointed out by Kiuhnm 〔kiuhnm03.4t.yahoo.it〕 and Tim Bradshaw. Thanks.

though, i disagree with the way they expressed it, or with any sense that this is different from math.

to clarify and amend my original post, here's what's needed for binary operator precedence:

① the symbols are ordered. (e.g. each given a unique integer)

② each symbol has a spec of either left-side stickiness or right-side stickiness. (needed when adjacent symbols are the same.)

About the lisp case mentioned by Tim, e.g. in 「(f a b c)」, whether it means 「(f (f a b) c)」 or 「(f a (f b c))」: it is not directly relevant to the context of my original post, because it isn't about operators. It's about function argument eval order. Good point, nevertheless.

the perl doc is still misleading, terribly badly written. Betcha ass!

Xah

On Feb 29, 4:08 am, Kiuhnm kiuhnm03.4t.yahoo.it wrote:
> On 2/29/2012 9:09, Xah Lee wrote:
> > New Science Discovery: Perl Idiots Remain Idiots After A Decade!
> >
> > A excerpt from the new book 〈Modern Perl〉, just published, chapter 4 on “Operators”. Quote:
> >
> > «The associativity of an operator governs whether it evaluates from left to right or right to left. Addition is left associative, such that 2 + 3 + 4 evaluates 2 + 3 first, then adds 4 to the result. Exponentiation is right associative, such that 2 ** 3 ** 4 evaluates 3 ** 4 first, then raises 2 to the 81st power. »
> >
> > LOL. Looks like the perl folks haven't changed. Fundamentals of serious math got botched so badly.
> >
> > Let me explain the idiocy.
> >
> > It says “The associativity of an operator governs whether it evaluates from left to right or right to left.”. Ok, so let's say we have 2 operators: a white triangle △ and a black triangle ▲. Now, by the perl's teaching above, let's suppose the white triangle is “right associative” and the black triangle is “left associative”. Now, look at this:
> >
> > 3 △ 6 ▲ 5
> >
> > seems like the white and black triangles are going to draw a pistol and fight for the chick 6 there. LOL.
>
> Sorry, but you're wrong and they're right.
>
> Associativity governs the order of evaluation of a group of operators *OF THE SAME PRECEDENCE*.
>
> If you write 2**3**4 only the fact the '**' is right associative will tell you that the order is 2**(3**4) and not (2**3)**4
>
> I remind you that 2^(3^4) != (2^3)^4.
>
> Kiuhnm
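The two ingredients listed in the amended post (an ordering of the symbols, plus a left or right spec for each) are the inputs a precedence-climbing parser works from. Below is a minimal sketch of that technique, assuming a made-up operator table OPS and a hypothetical parse function; it builds nested tuples rather than evaluating, so the grouping is visible.

```python
# (precedence, associativity) per operator; a higher number binds tighter.
# This table is an assumption for illustration, not any language's real grammar.
OPS = {"+": (1, "left"), "-": (1, "left"), "*": (2, "left"), "**": (3, "right")}


def parse(tokens, min_prec=1):
    """Precedence climbing over a token list; consumes tokens from the front."""
    left = tokens.pop(0)  # an operand
    while tokens and tokens[0] in OPS and OPS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPS[op]
        # Left-associative: the right side may only hold tighter operators.
        # Right-associative: the right side may hold the same operator again.
        next_min = prec + 1 if assoc == "left" else prec
        right = parse(tokens, next_min)
        left = (op, left, right)
    return left


print(parse("1 + 2 + 3 + 4".split()))
print(parse("2 ** 3 ** 4".split()))
print(parse("1 + 2 * 3".split()))
```

The associativity entry only matters when neighbouring operators have equal precedence; for neighbours of different precedence the ordering alone decides, which is the 「3 △ 6 ▲ 5」 case.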
lang comparison: in-place algorithm for reversing a list in Perl, Python, Lisp
fun example.

in-place algorithm for reversing a list in Perl, Python, Lisp
http://xahlee.org/comp/in-place_algorithm.html

plain text follows

What's “In-place Algorithm”?

Xah Lee, 2012-02-29

This page tells you what an “in-place algorithm” is, using {python, perl, emacs lisp} code to illustrate.

Here's the Wikipedia “In-place algorithm” excerpt:

In computer science, an in-place algorithm (or in Latin in situ) is an algorithm which transforms input using a data structure with a small, constant amount of extra storage space. The input is usually overwritten by the output as the algorithm executes. An algorithm which is not in-place is sometimes called not-in-place or out-of-place.

Python

Here's python code for reversing a list. Done by creating a new list, NOT using in-place:

    # python
    # reverse a list

    list_a = ["a", "b", "c", "d", "e", "f", "g"]

    list_length = len(list_a)
    list_b = [0] * list_length

    for i in range(list_length):
        list_b[i] = list_a[list_length - 1 - i]

    print(list_b)

Here's an in-place algorithm for reversing a list:

    # python
    # in-place algorithm for reversing a list

    list_a = ["a", "b", "c", "d", "e", "f", "g"]

    list_length = len(list_a)

    for i in range(list_length // 2):
        x = list_a[i]
        list_a[i] = list_a[list_length - 1 - i]
        list_a[list_length - 1 - i] = x

    print(list_a)

Perl

Here's perl code for reversing a list. Done by creating a new list, NOT using in-place:

    # perl

    use strict;
    use Data::Dumper;

    my @listA = qw(a b c d e f g);

    my $listLength = scalar @listA;
    my @listB = ();

    for ( my $i = 0; $i < $listLength; $i++ ) {
        $listB[$i] = $listA[ $listLength - 1 - $i ];
    }

    print Dumper(\@listB);

Here's the in-place version:

    # perl
    # in-place algorithm for reversing a list.

    use strict;
    use Data::Dumper;
    use POSIX; # for “floor”

    my @listA = qw(a b c d e f g);

    my $listLength = scalar @listA;

    for ( my $i = 0; $i < floor($listLength/2); $i++ ) {
        my $x = $listA[$i];
        $listA[$i] = $listA[ $listLength - 1 - $i ];
        $listA[ $listLength - 1 - $i ] = $x;
    }

    print Dumper(\@listA);
    __END__

emacs lisp

    ;; emacs lisp
    ;; reverse an array

    (setq arrayA [a b c d e f g])

    (setq arrayLength (length arrayA))

    (setq arrayB (make-vector arrayLength 0))

    (dotimes (i arrayLength)
      (aset arrayB i (aref arrayA (- (1- arrayLength) i))))

    (print (format "%S" arrayB))

    ;; emacs lisp
    ;; in-place algorithm for reversing an array

    (setq arrayA [a b c d e f g])

    (setq arrayLength (length arrayA))

    (dotimes (i (floor (/ arrayLength 2)))
      (let (x)
        (setq x (aref arrayA i))
        (aset arrayA i (aref arrayA (- (1- arrayLength) i)))
        (aset arrayA (- (1- arrayLength) i) x)))

    (print (format "%S" arrayA))

Xah
Re: lang comparison: in-place algorithm for reversing a list in Perl, Python, Lisp
On Feb 29, 9:01 pm, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote:
> You don't need a temporary variable to swap two values in Python. A better way to reverse a list using more Pythonic idioms is:
>
>     for i in range(len(list_a)//2):
>         list_a[i], list_a[-i-1] = list_a[-i-1], list_a[i]

forgive me sir, but i haven't been at python for a while. :)

i was, actually, refreshing myself on what little polyglot skills i have.

Xah
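The quoted idiom, wrapped in a function so it can be tried on its own. The name reverse_in_place is made up for illustration; like list.reverse(), it modifies its argument and returns nothing, and the only extra storage is the loop index and the temporary tuple used for the swap.

```python
def reverse_in_place(items):
    """Reverse a list by swapping mirrored positions; uses O(1) extra space."""
    for i in range(len(items) // 2):
        # Negative indexing reaches the mirror element: -i-1 counts from the end.
        items[i], items[-i - 1] = items[-i - 1], items[i]


letters = ["a", "b", "c", "d", "e", "f", "g"]
reverse_in_place(letters)
print(letters)  # ['g', 'f', 'e', 'd', 'c', 'b', 'a']
```

With an odd length the middle element is never touched, which is why the loop stops at len(items) // 2.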
Re: Questions about LISP and Python.
On Dec 5, 4:31 am, Tim Bradshaw t...@tfeb.org wrote: On 2011-12-05 11:51:11 +, Xah Lee said: python has more readible syntax, more modern computer language concepts, and more robust libraries. These qualities in turn made it popular. Yet you still post here: why? i don't like python, and i prefer emacs lisp. The primary reason is that python is not functional, especially with python 3. The python community is full of fanatics with their drivels. In that respect, it's not unlike Common Lisp community and Scheme lisp community. see also: 〈Python Documentation Problems〉 http://xahlee.org/perl-python/python_doc_index.html 〈Computer Language Design: What's List Comprehension and Why is It Harmful?〉 http://xahlee.org/comp/list_comprehension.html 〈Lambda in Python 3000〉 http://xahlee.org/perl-python/python_3000.html 〈What Languages to Hate〉 http://xahlee.org/UnixResource_dir/writ/language_to_hate.html 〈Xah on Programing Languages〉 http://xahlee.org/Periodic_dosage_dir/comp_lang.html Xah -- http://mail.python.org/mailman/listinfo/python-list
Programing Language: latitude-longitude-decimalize
fun programing exercise. Write a function “latitude-longitude-decimalize”.

It should take a string like this: 「37°26′36.42″N 06°15′14.28″W」. The return value should be a pair of numbers, like this: 「[37.44345 -6.25396]」.

Feel free to use perl, python, ruby, lisp, etc. I'll post an emacs lisp solution in a couple of days.

Xah
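One possible Python 3 answer to the exercise, as a sketch. The function name follows the post with underscores; the regular expression assumes the input always uses the ° ′ ″ characters and a trailing N, S, E or W exactly as in the sample string. The result is left unrounded; for the sample it agrees with the pair in the post to about four decimal places.

```python
import re

# degrees ° minutes ′ seconds ″ hemisphere, as in the sample string.
_DMS = re.compile(r"(\d+)°(\d+)′([\d.]+)″([NSEW])")


def latitude_longitude_decimalize(text):
    """Convert every degree/minute/second group in text to signed decimal degrees."""
    values = []
    for deg, minute, sec, hemisphere in _DMS.findall(text):
        value = int(deg) + int(minute) / 60 + float(sec) / 3600
        if hemisphere in "SW":  # south and west are negative by convention
            value = -value
        values.append(value)
    return values


print(latitude_longitude_decimalize("37°26′36.42″N 06°15′14.28″W"))
```

Returning a list of all matches, instead of exactly two numbers, keeps the function usable on a lone latitude or longitude.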
question about speed of sequential string replacement vs regex or
curious question.

suppose you have 300 different strings and they all need to be replaced by, say, aaa.

is it faster to replace each one sequentially (i.e. replace the first string with aaa, then do the 2nd, 3rd, ...), or is it faster to use a regex that “or”s them all and do the replacement in one shot? (i.e. 1ststr|2ndstr|3rdstr|... -> aaa)

let's say the sourceString this replacement is to be done on is 500k chars.

Anyone? i suppose the answer will be similar for perl, python, ruby.

btw, the origin of this question is about writing an emacs lisp function that replaces ~250 html named entities with unicode chars.

Xah
Re: question about speed of sequential string replacement vs regex or
On Sep 28, 3:57 am, mer...@stonehenge.com (Randal L. Schwartz) wrote:
> "Xah" == Xah Lee xah...@gmail.com writes:
>
> Xah> curious question.
> Xah> suppose you have 300 different strings and they need all be replaced
> Xah> to say aaa.
>
> And then suppose this isn't the *real* question, but one entirely of Fiction by Xah Lee.
>
> How helpful do you want to be?

it's an interesting question anyway.

the question originally came from when i was coding an elisp function that changes html entities to unicode char literals. The problem is slightly complicated, involving a few questions about speed in emacs, e.g. string vs buffer, and much more... i spent several hours on this but it's probably too boring to detail (but i'll do so if anyone wishes). But anyway, while digging into these questions that were not clear in my mind, i thought of why not generate a regex “or” construct and do it in one shot, and wondered if that'd be faster. But afterwards, i realized this wouldn't be applicable to my problem, because for my problem each string needs to be changed to a unique string, not all to the same string.

Xah
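The one-shot regex idea can still cover the case where each string maps to its own replacement, by looking the match up in a table. A minimal Python sketch follows; the function names and the three-entry PAIRS table are made up for illustration, and which variant is faster on real data would have to be measured. The example also shows that the two approaches are not interchangeable: sequential replacement can re-process text produced by an earlier step.

```python
import re


def replace_sequential(text, pairs):
    """Apply each (old, new) replacement in turn over the whole text."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


def replace_one_pass(text, pairs):
    """Scan the text once, replacing each match via a lookup table."""
    table = dict(pairs)
    # Longest keys first, so a key that is a prefix of another cannot shadow it.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


PAIRS = [("&amp;", "&"), ("&lt;", "<"), ("&eacute;", "é")]
sample = "caf&eacute; &amp;lt; 3"

print(replace_sequential(sample, PAIRS))  # café < 3
print(replace_one_pass(sample, PAIRS))    # café &lt; 3
```

In the sequential version the "&" produced from "&amp;" joins the following "lt;" and gets decoded a second time; the single pass never looks at its own output.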
Re: question about speed of sequential string replacement vs regex or
here's more detail about the origin of this problem. Relevant to emacs lisp only.

--

in the definition of “replace-regexp-in-string”, there's this comment:

    ;; To avoid excessive consing from multiple matches in long strings,
    ;; don't just call `replace-match' continually. Walk down the
    ;; string looking for matches of REGEXP and building up a (reversed)
    ;; list MATCHES. This comprises segments of STRING which weren't
    ;; matched interspersed with replacements for segments that were.
    ;; [For a `large' number of replacements it's more efficient to
    ;; operate in a temporary buffer; we can't tell from the function's
    ;; args whether to choose the buffer-based implementation, though it
    ;; might be reasonable to do so for long enough STRING.]

note: «For a `large' number of replacements it's more efficient to operate in a temporary buffer».

my question is, anyone got some idea how “large” is large?

currently, i have a function replace-pairs-in-string which is implemented by repeatedly calling “replace-regexp-in-string” like this:

    (while (< ii (length pairs))
      (setq mystr (replace-regexp-in-string
                   (elt tempMapPoints ii)
                   (elt (elt pairs ii) 1)
                   mystr t t))
      (setq ii (1+ ii))
      )

When there are 260 pairs of strings to be replaced on a file that's 26k in size, my function takes about 3 seconds (which i think is too slow). I'm at pains deciding whether my function should be implemented like this or whether it should create a temp buffer. The problem with a temp buffer is that, if you repeatedly call the function, the overhead of creating the buffer is going to make it much slower.

i was actually surprised that replace-regexp-in-string isn't written in C, which i thought it was.

is there a technical reason that replace-regexp-in-string isn't in C? (i suppose only because nobody ever did it and the need for speed didn't suffice?)

and also, shouldn't there also be a replace-in-string (literal, not regex)? because i thought implementing replacement for a string should be much simpler and faster, because buffers come with a whole structure such as “point”, text properties, buffer names, buffer modifier, etc.

Xah

On Sep 28, 5:28 am, Xah Lee xah...@gmail.com wrote:
> On Sep 28, 3:57 am, mer...@stonehenge.com (Randal L. Schwartz) wrote:
> > "Xah" == Xah Lee xah...@gmail.com writes:
> >
> > Xah> curious question.
> > Xah> suppose you have 300 different strings and they need all be replaced
> > Xah> to say aaa.
> >
> > And then suppose this isn't the *real* question, but one entirely of Fiction by Xah Lee.
> >
> > How helpful do you want to be?
>
> it's a interesting question anyway.
>
> the question originally came from when i was coding elisp of a function that changes html entities to unicode char literal. The problem is slightly complicated, involving a few questions about speed in emacs. e.g. string vs buffer, and much more... i spent several hours on this but it's probably too boring to detail (but i'll do so if anyone wishes). But anyway, while digging these questions that's not clear in my mind, i thought of why not generate a regex or construct and do it in one shot, and wondered if that'd be faster. But afterwards, i realized this wouldn't be applicable to my problem because for my problem each string needs to be changed to a unique string, not all to the same string.
>
> Xah
Re: What Programing Language are the Largest Website Written In?
On Jul 31, 11:38 am, gavino gavcom...@gmail.com wrote: On Jul 13, 1:04 pm, ccc31807 carte...@gmail.com wrote: On Jul 12, 7:54 am, Xah Lee xah...@gmail.com wrote: maybe this will be of interest. 〈What Programing Language Are the Largest Website Written In?〉http://xahlee.org/comp/website_lang_popularity.html About five years ago, I did some pretty extensive research, and concluded that the biggest sites were written serverside with JSP. Obviously, this wouldn't include The Biggest site, but if you were a big, multinational corporation, or listed on the NYSE, you used JSP for your web programming. I doubt very seriously PHP is used for the biggest sites -- I'd still guess JSP, or maybe a MS technology (not VB), but it's only a guess. CC. facebook is php myspace is microsoft aol was tcl and aolserver c embedding tcl interp priceline is lisp reddit is python was lisp orig amazon was perl livejournal was perl thanks Kevin. Rarely seen you useful. :) Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: a little parsing challenge ☺
On Jul 19, 11:14 am, Thomas Jollans t...@jollybox.de wrote: I thought I'd have some fun with multi-processing: Nice joke. ☺ Here's a sane version: https://gist.github.com/1087682/2240a0834463d490c29ed0f794ad15128849ff8e hi thomas, i still cant get your code to work. I have a dir named xxdir with a single test file xx.txt,with this content: foo[(])bar when i run your code py3 validate_brackets_Thomas_Jollans_2.py it simply exit and doesn't seem to do anything. I modded your code to print the file name it's proccessing. Apparently it did process it. my python isn't strong else i'd dive in. Thanks. I'm on Python 3.2.1. Here's a shell log: h3@H3-HP 2011-07-21 05:20:30 ~/web/xxst/find_elisp/validate matching brackets py3 validate_brackets_Thomas_Jollans_2.py h3@H3-HP 2011-07-21 05:20:34 ~/web/xxst/find_elisp/validate matching brackets py3 validate_brackets_Thomas_Jollans_2.py c:/Users/h3/web/xxst/find_elisp/validate matching brackets/xxdir \xx.txt h3@H3-HP 2011-07-21 05:21:59 ~/web/xxst/find_elisp/validate matching brackets py3 --version Python 3.2.1 h3@H3-HP 2011-07-21 05:27:03 ~/web/xxst/find_elisp/validate matching brackets Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: a little parsing challenge ☺
On Jul 19, 11:07 am, Thomas Jollans t...@jollybox.de wrote:
> On 19/07/11 18:54, Xah Lee wrote:
> > On Sunday, July 17, 2011 2:48:42 AM UTC-7, Raymond Hettinger wrote:
> > > On Jul 17, 12:47 am, Xah Lee xah...@gmail.com wrote:
> > > > i hope you'll participate. Just post solution here. Thanks.
> > >
> > > http://pastebin.com/7hU20NNL
> >
> > just installed py3.
> > there seems to be a bug.
> > in this file
> >
> > http://xahlee.org/p/time_machine/tm-ch04.html
> >
> > there's a mismatched double curly quote. at position 28319.
> >
> > the python code above doesn't seem to spot it?
> >
> > here's the elisp script output when run on that dir:
> >
> > Error file: c:/Users/h3/web/xahlee_org/p/time_machine/tm-ch04.html
> > [“ 28319]
> > Done deal!
>
> That script doesn't check that the balance is zero at the end of file.
>
> Patch:
>
> --- ../xah-raymond-old.py 2011-07-19 20:05:13.0 +0200
> +++ ../xah-raymond.py 2011-07-19 20:03:14.0 +0200
> @@ -16,6 +16,8 @@
>          elif c in closers:
>              if not stack or c != stack.pop():
>                  return i
> +    if stack:
> +        return i
>      return -1
>
>  def scan(directory, encoding='utf-8'):

Thanks a lot for the fix Raymond.

Though, the code seems to have a minor problem. It works, but the report is wrong. e.g. output:

30068: c:/Users/h3/web/xahlee_org/p/time_machine\tm-ch04.html

that 30068 position is the last char in the file. The correct one should be 28319. (or at least point somewhere in the file at a bracket char that doesn't match.)

Today, i tried 3 more scripts. 2 fixed python3 versions, 1 ruby, all failed again. I've reported the problems i encountered at the python or ruby newsgroups.

If you are the author, a fix is very much appreciated. I'll get back to your code and eventually do a blog summary of all the different lang versions.

Am off to test that elaborate perl regex now... cross fingers.

Xah. Mood: quite discouraged.
Re: a little parsing challenge ☺
2011-07-21

On Jul 18, 12:09 am, Rouslan Korneychuk rousl...@msn.com wrote:
> I don't know why, but I just had to try it (even though I don't usually use Perl and had to look up a lot of stuff). I came up with this:
>
>     /(?|
>     (\()(?&matched)([\}\]”›»】〉》」』]|$) |
>     (\{)(?&matched)([\)\]”›»】〉》」』]|$) |
>     (\[)(?&matched)([\)\}”›»】〉》」』]|$) |
>     (“)(?&matched)([\)\}\]›»】〉》」』]|$) |
>     (‹)(?&matched)([\)\}\]”»】〉》」』]|$) |
>     («)(?&matched)([\)\}\]”›】〉》」』]|$) |
>     (【)(?&matched)([\)\}\]”›»〉》」』]|$) |
>     (〈)(?&matched)([\)\}\]”›»】》」』]|$) |
>     (《)(?&matched)([\)\}\]”›»】〉」』]|$) |
>     (「)(?&matched)([\)\}\]”›»】〉》』]|$) |
>     (『)(?&matched)([\)\}\]”›»】〉》」]|$))
>     (?(DEFINE)(?<matched>(?:
>     \((?&matched)\) |
>     \{(?&matched)\} |
>     \[(?&matched)\] |
>     “(?&matched)” |
>     ‹(?&matched)› |
>     «(?&matched)» |
>     【(?&matched)】 |
>     〈(?&matched)〉 |
>     《(?&matched)》 |
>     「(?&matched)」 |
>     『(?&matched)』 |
>     [^\(\{\[“‹«【〈《「『\)\}\]”›»】〉》」』]++)*+))
>     /sx;
>
> If the pattern matches, there is a mismatched bracket. $1 is set to the mismatched opening bracket. $-[1] is its location. $2 is the mismatched closing bracket or '' if the bracket was never closed. $-[2] is set to the location of the closing bracket or the end of the string if the bracket wasn't closed.
>
> I didn't write all that manually; it was generated with this:
>
>     my @open = ('\(','\{','\[','“','‹','«','【','〈','《','「','『');
>     my @close = ('\)','\}','\]','”','›','»','】','〉','》','」','』');
>
>     '(?|'.join('|',map
>     {'('.$open[$_].')(?&matched)(['.join('',@close[0..($_-1),($_+1)..$#close]).
>     ']|$)'} (0 .. $#open)).')(?(DEFINE)(?<matched>(?:'.join('|',map
>     {$open[$_].'(?&matched)'.$close[$_]} (0 .. $#open)).'|[^'.join('',@open,@close).']++)*+))'

Thanks for the code.

are you willing to make it complete and standalone? i.e. so i can run it like this:

    perl Rouslan_Korneychuk.pl dirPath

and it prints any file that has a mismatched pair, with line/column number or the char position?

i'd do it myself but so far i tried 5 codes, 3 fixes, all failed. Not a complaint, but it does take time to gather the code, in different langs by different people, properly document their authors and original source urls, etc, and test it out on my environment. All together in the past 3 days i spent perhaps a total of 4 hours running several codes and writing back etc, and so far not one really worked.

i know perl well, but your code is a bit out of the ordinary ☺. If the past days had been a good experience, i might dive in and study it for fun.

Xah
Re: a little parsing challenge ☺
Ok. Here's a preliminary report.

〈Lisp, Python, Perl, Ruby … Code to Validate Matching Brackets〉
http://xahlee.org/comp/validate_matching_brackets.html

it's taking too much time to go thru.

right now, i consider only one code valid, by Raymond Hettinger (with a minor edit from others).

right now, there are 2 other possibly correct solutions. One by Robert Klemme, but it requires ruby19 and i only have ruby18x. One by Thomas Jollans in Python 3, but it didn't run on my machine, perhaps due to some unix/Windows issue, yet to be sorted out.

the other 3 or 4 seem to be incomplete or just suggestions of ideas.

i haven't done extensive testing on my own code either.

I'll revisit maybe in a few days.

Feel free to grab my report and make it nice.

If you would like to fix your code, feel free to email.

Xah

On Jul 21, 7:26 am, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Thu, Jul 21, 2011 at 6:58 AM, Xah Lee xah...@gmail.com wrote:
> > Thanks a lot for the fix Raymond.
>
> That fix was from Thomas Jollans, not Raymond Hettinger.
>
> > Though, the code seems to have a minor problem.
> > It works, but the report is wrong.
> > e.g. output:
> >
> > 30068: c:/Users/h3/web/xahlee_org/p/time_machine\tm-ch04.html
> >
> > that 30068 position is the last char in the file.
> > The correct should be 28319. (or at least point somewhere in the file at a bracket char that doesn't match.)
>
> Previously you wrote:
>
> > If a file has mismatched matching-pairs, the script will display the file name, and the line number and column number of the first instance where a mismatched bracket occures. (or, just the char number instead (as in emacs's “point”))
>
> I submit that as the file contains no mismatched brackets (only an orphan bracket), the output is correct to specification (indeed you did not define any output for this case), if not necessarily useful.
>
> In other words, stop being picky. You may be willing to spend an hour or more on this, but that doesn't mean anybody else is. Raymond gave you a basically working Python solution, but forgot one detail. Thomas fixed that detail for you but didn't invest the time to rewrite somebody else's function to get the output correct. Continuing to harp on it at this point is verging on trolling.
Re: a little parsing challenge ☺
On Jul 21, 9:43 am, pyt...@bdurham.com wrote:
> Xah,
> 1. Is the following string considered legal?
> [ { ( ] ) }
> Note: Each type of brace opens and closes in the proper sequence. But
> inter-brace opening and closing does not make sense.

nu!

> Or must a closing brace always balance out with the most recent
> opening brace like so?
> [ { ( ) } ]

yeah!

> 2. If there are multiple unclosed braces at EOF, is the answer you're
> looking for the position of the first open brace that hasn't been
> closed out yet?

well, as many pointed out, i really haven't thought it out well. originally, i just wanted to know the position of an unmatched char. i haven't taken the time to think about what the desired behavior really should be.

For me, the problem started because i wanted to use the script to check my 5k html files, in particular, classic novels that involve double curly quotes and french quotes. So, the desired behavior is one based on the question of what would be best for the user to see in order to correct a bracket mismatch error in a file. (which can get quite complex for nested syntax, because, usually, once you have one missed, it's all hell from there. I think this is similar to the problem when a compiler/interpreter encounters a bad syntax in source code, and thus the popular situation where error code of computer programs are hard to understand...)

but anyway, just for this exercise, the requirement needn't be stringent. I still think that at least the reported position should be at a matching-pair char in the file. (and if we presume this, then only my code works. LOL)

PS this is a warmup problem for writing a HTML tag validator. I looked high and low in past years, but just couldn't find a script that does simple validation in batch. The w3c one is based on SGML, a really huge amount of un-understandable irregular historical baggage. An XML lexical validator is much closer, but still not regular. I simply wanted one just like the match-pair validator in our problem, except the opening char is not a single char but a string of the form <xyz …> and the *matching* closing one is of the form </xyz>, and with just one exception: when a tag has “/” at the end, such as <br/>, then it is skipped (i.e. not considered as opening or closing).

I'll be writing this soon in elisp… since i haven't studied parsers, i had hopes that a parser expert would show some proper parser solutions… in particular i think such can be expressed in Parsing Expression Grammar in just a few lines… but so far no deity came forward to show the light. lol

getting ranty… it's funny, somehow the tech geekers all want regex to solve the problem. Regex, regex, regex, a 40 years old deviant bastard that by some twist of luck became a tool for matching text patterns. One bloke came forward to show off a perl regex obfuscation. That's like, lol. But it might be good for the lulz if his code is actually complete and worked. Then, you have a few who'd nonchalantly remark “O, you just need push-down automata”. LOL, unless they show actual working code, it's Automata their asses.

folks, don't get angry with me. I'm a learner. I'm curious. I always am eager to learn. And there's always things we can learn. Don't get into a fit and do the troll dance in a pit with me. Nobody's gonna give a shit if you think u knew it all. If u are not the master of one thousand and one languages yet, you can learn with me. ☺

troll Xah
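The tag validator described in the PS above could be sketched in Python along these lines. This is only a rough illustration, not the elisp script Xah planned to write: the names `TAG` and `first_tag_mismatch` are made up here, the tag regex is a simplification that ignores comments, doctype declarations, and a “>” inside attribute values, and a tag written without the trailing “/” (e.g. a bare br tag) is treated as an opening tag, exactly as the description implies.

```python
import re

# Hypothetical tokenizer: matches <name ...>, </name>, and <name .../>.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")

def first_tag_mismatch(html):
    """Return (tag_name, char_index) of the first problem, or None if balanced."""
    stack = []  # open tags seen so far, as (name, index) tuples
    for m in TAG.finditer(html):
        closing, name, self_closing = m.group(1), m.group(2).lower(), m.group(3)
        if self_closing:
            continue  # e.g. <br/> counts as neither opening nor closing
        if not closing:
            stack.append((name, m.start()))
        elif stack and stack[-1][0] == name:
            stack.pop()  # closing tag matches the most recent opening tag
        else:
            return (name, m.start())  # closing tag with no matching opener on top
    # Tags still open at end of input: report the most recent one.
    return stack[-1] if stack else None
```

For example, `first_tag_mismatch("<p><b>x</p>")` points at the closing p tag, because the b tag on top of the stack was never closed. Reporting the closing tag rather than the unclosed opener is an arbitrary choice in this sketch.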
Re: a little parsing challenge ☺
i've just cleaned up my elisp code and wrote a short elisp tutorial. Here:

〈Emacs Lisp: Batch Script to Validate Matching Brackets〉
http://xahlee.org/emacs/elisp_validate_matching_brackets.html

plain text version follows. Please let me know what you think.

am still working on going thru all code in other langs. Will get to the ruby one, and that perl regex, and the other fixed python ones. (possibly also the 2 common lisp codes but am not sure they are runnable as is or just some non-working showoff. lol)

===

Emacs Lisp: Batch Script to Validate Matching Brackets

Xah Lee, 2011-07-19

This page shows you how to write an elisp script that checks thousands of files for mismatched brackets.

The Problem

Summary

I have 5 thousand files containing many matching pairs. I want to know if any of them contains mismatched brackets.

Detail

The matching pairs include these: () {} [] “” ‹› «» 〈〉 《》 【】 〖〗 「」 『』.

The program should be able to check all files in a dir, and report any file that has a mismatched bracket, and also indicate the line number or position where a mismatch occurs.

For those curious, if you want to know what these brackets are, see:

• Syntax Design: Use of Unicode Matching Brackets as Specialized Delimiters
• Intro to Chinese Punctuation with Computer Language Syntax Perspectives

For other notes and conveniences about dealing with brackets in emacs, see:

• Emacs: Defining Keys to Navigate Brackets
• “extend-selection” at A Text Editor Feature: Extend Selection by Semantic Unit
• “select-text-in-quote” at Suggestions on Emacs's mark-word Command

Solution

Here's an outline of the steps.

• Go thru the file char by char, find a bracket char.
• Check if the one on the stack is a matching opening char. If so remove it. Else, push the current one onto the stack.
• Repeat the above till there are no more bracket chars in the file.
• If the stack is not empty, then the file has mismatched brackets. Report it.
• Do the above on all files.
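For readers who don't use elisp, the outlined steps might look roughly like this in Python. This is a sketch under assumptions, not a translation of the tutorial's code: the names `PAIRS`, `CLOSE_TO_OPEN`, and `leftover_brackets` are invented for illustration, and positions are 0-based string indexes rather than emacs's 1-based “point”.

```python
# Hypothetical names; the pairs are the twelve listed in the tutorial.
PAIRS = {"(": ")", "{": "}", "[": "]", "“": "”", "‹": "›", "«": "»",
         "〈": "〉", "《": "》", "【": "】", "〖": "〗", "「": "」", "『": "』"}
CLOSE_TO_OPEN = {close: open_ for open_, close in PAIRS.items()}
BRACKETS = set(PAIRS) | set(CLOSE_TO_OPEN)

def leftover_brackets(text):
    """Return the (char, index) entries left on the stack after cancelling pairs."""
    stack = []
    for index, char in enumerate(text):
        if char not in BRACKETS:
            continue  # only bracket chars are of interest
        # A closing char cancels a matching opening char on top of the stack.
        if stack and char in CLOSE_TO_OPEN and stack[-1][0] == CLOSE_TO_OPEN[char]:
            stack.pop()
        else:
            stack.append((char, index))  # otherwise push the current char
    return stack
```

An empty result means every bracket was cancelled by a partner; anything left over marks the file as mismatched. Which leftover entry to report (the oldest, `stack[0]`, or the newest, `stack[-1]`) is a separate design decision that the outline leaves open.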
Here's some interesting use of lisp features to implement the above.

Define Matching Pair Chars as “alist”

We begin by defining the chars we want to check, as an “association list” (aka “alist”). Like this:

(setq matchPairs '(
                   ("(" . ")")
                   ("{" . "}")
                   ("[" . "]")
                   ("“" . "”")
                   ("‹" . "›")
                   ("«" . "»")
                   ("【" . "】")
                   ("〖" . "〗")
                   ("〈" . "〉")
                   ("《" . "》")
                   ("「" . "」")
                   ("『" . "』")
                   )
      )

If you care only to check for curly quotes, you can remove elements above. This is convenient because some files necessarily have mismatched pairs such as the parenthesis, because that char is used for many non-bracketing purposes (e.g. ASCII smiley).

An “alist” in lisp is basically a list of pairs (called key and value), with the ability to search for a key or a value. The first element of a pair is called its key, the second element is its value. Each pair is a “cons”, like this: (cons mykey myvalue), which can also be written using this syntax: (mykey . myvalue) for easier reading.

The purpose of lisp's “alist” is similar to Python's dictionary or Pretty Home Page's array. It is also similar to a hashmap, except that an alist can have duplicate keys, can be searched by value, maintains order, and is not intended for a massive number of elements. Elisp has a hashmap datatype if you need that. (See: Emacs Lisp Tutorial: Hash Table.)

(info "(elisp) Association Lists")

Generate Regex String from alist

To search for a set of chars in emacs, we can read the buffer char-by-char, or, we can simply use “search-forward-regexp”. To use that, first we need to generate a regex string from our matchPairs alist.

First, we define/declare the string. Not a necessary step, but we do it for clarity.

(setq searchRegex "")

Then we go thru the matchPairs alist. For each pair, we use “car” and “cdr” to get the chars and “concat” them to the string. Like this:

(mapc
 (lambda (mypair)
   (setq searchRegex (concat searchRegex (regexp-quote (car mypair)) "|" (regexp-quote (cdr mypair)) "|") )
   )
 matchPairs)

Then we remove the ending “|”.

(setq searchRegex (substring searchRegex 0 -1)) ; remove the ending “|”

Then, change | to \\|. In elisp regex, the | is literal. The “regex or” is \|. And if you are using regex in elisp, elisp does not have a special regex string syntax, it only understands normal strings. So, to feed the regex \|, you need to escape the first backslash. So, your regex string needs to have \\|. Here's how we do
Re: a little parsing challenge ☺
On Jul 18, 7:07 pm, Billy Mays no...@nohow.com wrote:
> On 7/18/2011 7:56 PM, Steven D'Aprano wrote:
> > Billy Mays wrote:
> > > On 07/17/2011 03:47 AM, Xah Lee wrote:
> > > > 2011-07-16
> > >
> > > I gave it a shot. It doesn't do any of the Unicode delims, because
> > > let's face it, Unicode is for goobers.
> >
> > Goobers... that would be one of those new-fangled slang terms that
> > the young kids today use to mean its opposite, like bad, wicked and
> > sick, correct?
> >
> > I mention it only because some people might mistakenly interpret
> > your words as a childish and feeble insult against the 98% of the
> > world who want or need more than the 127 characters of ASCII, rather
> > than understand you meant it as a sign of the utmost respect for the
> > richness and diversity of human beings and their languages,
> > cultures, maths and sciences.
>
> TL;DR version: international character sets are a problem, and Unicode
> is not the answer to that problem).
>
> As long as I have used python (which I admit has only been 3 years)
> Unicode has never appeared to be implemented correctly. I'm probably
> repeating old arguments here, but whatever.
>
> Unicode is a mess. When someone says ASCII, you know that they can
> only mean characters 0-127. When someone says Unicode, do they mean
> real Unicode (and is it 2 byte or 4 byte?) or UTF-32 or UTF-16 or
> UTF-8? When using the 'u' datatype with the array module, the docs
> don't even tell you if it's 2 bytes wide or 4 bytes. Which is it? I'm
> sure that all of these can be figured out, but the problem is now I
> have to ask every one of these questions whenever I want to use
> strings.
>
> Secondly, Python doesn't do Unicode exception handling correctly. (but
> I suspect that it's a broader problem with languages) A good example
> of this is with UTF-8 where there are invalid code points (such as
> 0xC0, 0xC1, 0xF5, 0xF6, 0xF7, 0xF8, ..., 0xFF, but you already knew
> that, as well as everyone else who wants to use strings for some
> reason).
>
> When embedding Python in a long running application where user input
> is received, it is very easy to make a mistake which brings down the
> whole program. If any user string isn't properly try/excepted, a user
> could craft a malformed string which a UTF-8 decoder would choke on.
> Using ASCII (or whatever 8 bit encoding) doesn't have these problems
> since all codepoints are valid.
>
> Another (this must have been a good laugh amongst the UniDevs)
> 'feature' of unicode is the zero width space (UTF-8 code point 0xE2
> 0x80 0x8B). Any string can masquerade as any other string by placing a
> few of these in a string. Any word filters you might have are now
> defeated by some cheesy Unicode nonsense character. Can you just
> check for these characters and strip them out? Yes. Should you have
> to? I would say no.
>
> Does it get better? Of course! International character sets used for
> domain name encoding use yet a different scheme (Punycode). Are the
> following two domain names the same: tést.com , xn--tst-bma.com ? Who
> knows!
>
> I suppose I can gloss over the pains of using Unicode in C with every
> string needing to be an LPS since 0x00 is now a valid code point in
> UTF-8 (0x for 2 byte Unicode) or suffer the O(n) look up time to do
> strlen or concatenation operations.
>
> Can it get even better? Yep. We also now need to have a Byte order
> Mark (BOM) to determine the endianness of our characters. Are they
> little endian or big endian? (or perhaps one of the two possible
> middle endian encodings?) Who knows? String processing with unicode is
> unpleasant to say the least. I suppose that's what we get when things
> are designed by committee.
>
> But Hey! The great thing about standards is that there are so many to
> choose from.
>
> -- Bill

might check out my take

〈Xah's Unicode Tutorial〉
http://xahlee.org/Periodic_dosage_dir/unicode.html

especially good for emacs users.

if you grew up with english, unicode might seem complex or difficult due to unfamiliarity. but for asian people, when you don't have alphabets, it's kinda strange to think that a byte is a char. The notion simply doesn't exist and is impossible to establish. There were many encodings for chinese before unicode. Even today, unicode isn't used in taiwan or china. Taiwan uses big5, china uses GB18030, which contains all chars of unicode.

~8 years ago i thought that it'd be great if china adopted unicode sometime in the future... so that we all just have one charset to deal with. But that's never gonna happen. On the contrary, am thinking now there's the possibility that the world adopts GB18030 someday. lol

if you go to alexa.com for traffic ranking, a good percentage of the top few are chinese these days. more and more, as i have observed since the mid 2000s.

by the way, here's what these matching pairs are used for.

‹french quote›
«french quote»

the 〈〉 《》 are chinese brackets used for book titles etc. (CD, TV program, show title, etc.)

the 「」 『』 are traditional chinese quotes, like english's ‘single curly’, “double
Re: a little parsing challenge ☺
On Sunday, July 17, 2011 2:48:42 AM UTC-7, Raymond Hettinger wrote:
> On Jul 17, 12:47 am, Xah Lee xah...@gmail.com wrote:
> > i hope you'll participate. Just post solution here. Thanks.
>
> http://pastebin.com/7hU20NNL

just installed py3. there seems to be a bug. in this file

http://xahlee.org/p/time_machine/tm-ch04.html

there's a mismatched double curly quote, at position 28319. the python code above doesn't seem to spot it?

here's the elisp script output when run on that dir:

Error file: c:/Users/h3/web/xahlee_org/p/time_machine/tm-ch04.html
["“" 28319]
Done deal!

Xah
Re: a little parsing challenge ☺
On Jul 18, 10:12 am, Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com wrote:
> On 07/17/2011 03:47 AM, Xah Lee wrote:
> > 2011-07-16
>
> I gave it a shot. It doesn't do any of the Unicode delims, because
> let's face it, Unicode is for goobers.
>
> import sys, os
>
> # closing char is the key, opening char is the value
> pairs = {'}':'{', ')':'(', ']':'[', '"':'"', "'":"'", '>':'<'}
> valid = set( v for pair in pairs.items() for v in pair )
>
> for dirpath, dirnames, filenames in os.walk(sys.argv[1]):
>     for name in filenames:
>         stack = [' ']
>         with open(os.path.join(dirpath, name), 'rb') as f:
>             chars = (c for line in f for c in line if c in valid)
>             for c in chars:
>                 if c in pairs and stack[-1] == pairs[c]:
>                     stack.pop()
>                 else:
>                     stack.append(c)
>         print ("Good" if len(stack) == 1 else "Bad") + ': %s' % name
>
> -- Bill

as Ian Kelly mentioned, your script fails because it doesn't report the position or line/column number of the first mismatched bracket. This is a rather significant part of this small problem. Avoiding unicode also lessens the value of this exercise, because handling unicode in python isn't trivial, at least with respect to this small exercise.

I added other unicode brackets to your list of brackets, but it seems your code still fails to catch a file that has mismatched curly quotes. (e.g. http://xahlee.org/p/time_machine/tm-ch04.html )

LOL Billy.

Xah
Re: a little parsing challenge ☺
On Jul 18, 2:59 pm, Thomas 'PointedEars' Lahn pointede...@web.de wrote:
> Ian Kelly wrote:
> > Billy Mays wrote:
> > > I gave it a shot. It doesn't do any of the Unicode delims, because
> > > let's face it, Unicode is for goobers.
> >
> > Uh, okay...
> >
> > Your script also misses the requirement of outputting the index or
> > row and column of the first mismatched bracket.
>
> Thanks to Python's expressiveness, this can be easily remedied (see
> below).
>
> I also do not follow Billy's comment about Unicode. Unicode and the
> fact that Python supports it *natively* cannot be appreciated enough
> in a globalized world.
>
> However, I have learned a lot about being pythonic from his posting
> (take those generator expressions, for example!), and the idea of
> looking at the top of a stack for reference is a really good one.
> Thank you, Billy!
>
> Here is my improvement of his code, which should fill the mentioned
> gaps. I have also reversed the order in the report line as I think it
> is more natural this way. I have tested the code superficially with a
> directory containing a single text file. Watch for word-wrap:
>
> # encoding: utf-8
> '''
> Created on 2011-07-18
>
> @author: Thomas 'PointedEars' Lahn pointede...@web.de,
> based on an idea of Billy Mays
> 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com
> in news:j01ph6$knt$1...@speranza.aioe.org
> '''
> import sys, os
>
> pairs = {u'}': u'{', u')': u'(', u']': u'[',
>          u'”': u'“', u'›': u'‹', u'»': u'«',
>          u'】': u'【', u'〉': u'〈', u'》': u'《',
>          u'」': u'「', u'』': u'『'}
> valid = set(v for pair in pairs.items() for v in pair)
>
> if __name__ == '__main__':
>     for dirpath, dirnames, filenames in os.walk(sys.argv[1]):
>         for name in filenames:
>             stack = [' ']
>
>             # you can use chardet etc. instead
>             encoding = 'utf-8'
>
>             with open(os.path.join(dirpath, name), 'r') as f:
>                 reported = False
>                 chars = ((c, line_no, col)
>                          for line_no, line in enumerate(f)
>                          for col, c in enumerate(line.decode(encoding))
>                          if c in valid)
>                 for c, line_no, col in chars:
>                     if c in pairs:
>                         if stack[-1] == pairs[c]:
>                             stack.pop()
>                         else:
>                             if not reported:
>                                 first_bad = (c, line_no + 1, col + 1)
>                                 reported = True
>                     else:
>                         stack.append(c)
>
>             print '%s: %s' % (name, ("good" if len(stack) == 1 else "bad '%s' at %s:%s" % first_bad))

Thanks for the fix. Though, it still seems wrong.

On the file http://xahlee.org/p/time_machine/tm-ch04.html there is a mismatched curly double quote at 28319.

the script reports:

tm-ch04.html: bad ')' at 68:2

that doesn't seem right. Line 68 is empty. There's no opening or closing round bracket anywhere close. The nearest are on lines 11 and 127.

Maybe Billy Mays's algorithm is wrong.

Xah (fairly discouraged now, after running 3 python scripts that all failed)
Re: a little parsing challenge ☺
On Jul 17, 8:31 am, Thomas Jollans t...@jollybox.de wrote:
> On Jul 17, 9:47 am, Xah Lee xah...@gmail.com wrote:
> > 2011-07-16
> >
> > folks, this one will be interesting one.
> >
> > the problem is to write a script that can check a dir of text files
> > (and all subdirs) and reports if a file has any mismatched matching
> > brackets.
> >
> > • The files will be utf-8 encoded (unix style line ending).
> >
> > • If a file has mismatched matching-pairs, the script will display
> > the file name, and the line number and column number of the first
> > instance where a mismatched bracket occurs. (or, just the char
> > number instead (as in emacs's “point”))
> >
> > • the matching pairs are all single unicode chars. They are these
> > and nothing else: () {} [] “” ‹› «» 【】 〈〉 《》 「」 『』
> > Note that ‘single curly quote’ is not considered a matching pair
> > here.
> >
> > • Your script must be standalone. Must not be using some parser
> > tools. But can call lib that's part of standard distribution in
> > your lang.
> >
> > Here's an example of mismatched brackets: ([)], (“[[”), ((, 】etc.
> > (and yes, the brackets may be nested. There is usually text between
> > these chars.)
> >
> > I'll be writing a emacs lisp solution and post in 2 days. Ι welcome
> > other lang implementations. In particular, perl, python, php, ruby,
> > tcl, lua, Haskell, Ocaml. I'll also be able to eval common lisp
> > (clisp) and Scheme lisp (scsh), Java. Other langs such as Clojure,
> > Scala, C, C++, or any others, are all welcome, but i won't be able
> > to eval them. A javascript implementation will be very interesting
> > too, but please indicate which command line version to use and
> > where to install it from.
> >
> > I hope you'll find this an interesting “challenge”. This is a
> > parsing problem. I haven't studied parsers except some Wikipedia
> > reading, so my solution will probably be naive. I hope to see and
> > learn from your solution too.
> >
> > i hope you'll participate. Just post solution here. Thanks.
>
> I thought I'd have some fun with multi-processing:
>
> https://gist.github.com/1087682

hi Thomas. I ran the program, all cpus went max (i have a quad), but after i think 3 minutes nothing had happened, so i killed it.

is there something special one should know to run the script?

I'm using Python 3.2.1 on Windows 7.

Xah
Re: a little parsing challenge ☺
On Jul 19, 10:33 am, Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com wrote:
> On 07/19/2011 01:14 PM, Xah Lee wrote:
> > I added other unicode brackets to your list of brackets, but it
> > seems your code still fail to catch a file that has mismatched
> > curly quotes. (e.g. http://xahlee.org/p/time_machine/tm-ch04.html )
> >
> > LOL Billy.
> >
> > Xah
>
> I suspect it's due to the file being opened in 'rb' mode. Also, in
> the dictionary of characters at the top, the closing token is the key,
> while the opening one is the value. Not sure if that's obvious.
>
> Also, returning the position of the first mismatched pair is somewhat
> ambiguous. File systems store files as streams of octets (mine do
> anyways) rather than as characters. When you ask for the position of
> the first mismatched pair, do you mean the position as per
> file.tell() or do you mean the nth character in the utf-8 stream?
>
> Also, you may have answered this earlier but I'll ask again anyways:
> You ask for the first mismatched pair. Are you referring to the
> innermost mismatched, or the outermost? For example, suppose you have
> this file:
>
> foo[(])bar
>
> Would the ( be the first mismatched character or would the ]?

yes i haven't been precise. Thanks for bringing it up.

thinking about it now, i think it's a bit hard to define precisely. My elisp code actually reports the “)”, so it's wrong too. LOL

Xah
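Billy's two points about 'rb' mode and about byte offsets versus character positions can be made concrete with a small Python 3 sketch. The helper name `char_index_to_byte_offset` is made up for illustration; the facts it relies on are only that `str.encode` returns bytes and that a curly quote takes three bytes in UTF-8.

```python
def char_index_to_byte_offset(text, char_index, encoding="utf-8"):
    """Byte offset, in the encoded file, of the character at char_index."""
    # Everything before the character is encoded and measured in bytes.
    return len(text[:char_index].encode(encoding))

sample = "“a” [x"

# The opening curly quote is one character but three bytes in UTF-8, so code
# that walks a file opened in binary mode never sees it as a single unit.
quote_bytes = "“".encode("utf-8")

# The "[" is the 5th character (index 4) but sits at byte offset 8,
# because each of the two curly quotes before it occupies 3 bytes.
bracket_char_index = sample.index("[")
bracket_byte_offset = char_index_to_byte_offset(sample, bracket_char_index)
```

A checker therefore has to say which of the two numbers it reports. Emacs's “point” counts characters (starting from 1), while a position taken from a binary file handle counts bytes, and the two drift apart as soon as the file contains non-ASCII brackets.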
Re: a little parsing challenge ☺
On Jul 17, 12:47 am, Xah Lee xah...@gmail.com wrote:
> 2011-07-16
>
> folks, this one will be interesting one.
>
> the problem is to write a script that can check a dir of text files
> (and all subdirs) and reports if a file has any mismatched matching
> brackets.
> …

Ok, here's my solution (pasted at bottom). I haven't tried to make it elegant or terse yet, seeing that many are already much more elegant than i could possibly make my code.

my solution basically uses a stack. (i think all of us are doing something similar)

Here are the steps:

• Go thru the file char by char, find a bracket char.
• check if the one on the stack is a matching opening char. If so remove it. Else, push the current one onto the stack.
• Repeat the above till end of file.
• If the stack is not empty, then the file has mismatched brackets. Report it.
• Do the above on all files.

Many elegant solutions. Raymond Hettinger is very quick, posted a solution only an hour or so after i posted the problem. Many others are very short, very nice. Thank you all for writing them. I haven't studied them yet. I'll run them all and post a summary in 2 days. (i have a few thousand files to run this test thru, many of them have mismatched brackets. So i have good data to test with.)

PS we still lack perl, Scheme lisp, tcl, lua versions. These wouldn't be hard and would be interesting to read. If you are picking up one of these langs, this would be a good exercise. Haskell too. I particularly would like to see a javascript version run from the command line. Maybe somebody can put this exercise to Google folks ... they are like the js gods.

also, now that we have this home-brewed code, how'd a parser expert do it? Is it possible to make it even simpler by using some parser tools? (have no idea what those lex yacc do, or modern incarnations) I've also been thinking whether this can be done with Parsing Expression Grammar. That would make the code semantics really elegant (as opposed to home-cooked stack logic).

Xah

;; -*- coding: utf-8 -*-
;; 2011-07-15, Xah Lee
;; go thru a file, check if all brackets are properly matched.
;; e.g. good: (…{…}… “…”…)
;; bad: ( [)]
;; bad: ( ( )

(setq inputDir "~/web/xahlee_org/p/") ; must end in slash

(defvar matchPairs '() "a alist. For each pair, the car is opening char, cdr is closing char.")

(setq matchPairs '(
                   ("(" . ")")
                   ("{" . "}")
                   ("[" . "]")
                   ("“" . "”")
                   ("‹" . "›")
                   ("«" . "»")
                   ("【" . "】")
                   ("〈" . "〉")
                   ("《" . "》")
                   ("「" . "」")
                   ("『" . "』")
                   )
      )

(defvar searchRegex "" "regex string of all pairs to search.")
(setq searchRegex "")
(mapc
 (lambda (mypair)
   (setq searchRegex (concat searchRegex (regexp-quote (car mypair)) "|" (regexp-quote (cdr mypair)) "|") )
   )
 matchPairs)

(setq searchRegex (replace-regexp-in-string "|$" "" searchRegex t t)) ; remove the ending “|”

(setq searchRegex (replace-regexp-in-string "|" "\\|" searchRegex t t)) ; change | to \\| for regex “or” operation

(defun my-process-file (fpath)
  "process the file at fullpath fpath ..."
  (let (myBuffer (ii 0) myStack ξchar ξpos)

    (setq myStack '() ) ; each element is a vector [char position]
    (setq ξchar "")

    (setq myBuffer (get-buffer-create "myTemp"))
    (set-buffer myBuffer)
    (insert-file-contents fpath nil nil nil t)

    (goto-char 1)
    (while (search-forward-regexp searchRegex nil t)
      (setq ξpos (point) )
      (setq ξchar (buffer-substring-no-properties ξpos (- ξpos 1)) )

      ;; (princ (format "-\nfound char: %s \n" ξchar) )

      (let ((isClosingCharQ nil) (matchedOpeningChar nil) )
        (setq isClosingCharQ (rassoc ξchar matchPairs))
        (when isClosingCharQ (setq matchedOpeningChar (car isClosingCharQ) ) )

        ;; (princ (format "isClosingCharQ is: %s\n" isClosingCharQ) )
        ;; (princ (format "matchedOpeningChar is: %s\n" matchedOpeningChar) )

        (if
            (and
             (car myStack) ; not empty
             (equal (elt (car myStack) 0) matchedOpeningChar )
             )
            (progn
              ;; (princ (format "matched this bottom item on stack: %s \n" (car myStack)) )
              (setq myStack (cdr myStack) )
              )
          (progn
            ;; (princ (format "did not match this bottom item on stack: %s\n" (car myStack)) )
            (setq myStack (cons (vector ξchar ξpos) myStack) ) )
          )
        )
      ;; (princ "current stack: " )
      ;; (princ myStack )
      ;; (terpri )
      )

    (when (not (equal myStack nil))
      (princ "Error file: ")
      (princ fpath)
      (print (car myStack) )
      )
    (kill-buffer myBuffer)
    ))

;; (require 'find-lisp)

(let (outputBuffer)
  (setq outputBuffer "*xah match pair output*" )
  (with-output-to-temp-buffer outputBuffer
    (mapc 'my-process-file (find-lisp-find-files inputDir "\\.html$"))
    (princ "Done deal!")
    )
  )
a little parsing challenge ☺
2011-07-16

folks, this one will be interesting one.

the problem is to write a script that can check a dir of text files (and all subdirs) and reports if a file has any mismatched matching brackets.

• The files will be utf-8 encoded (unix style line ending).

• If a file has mismatched matching-pairs, the script will display the file name, and the line number and column number of the first instance where a mismatched bracket occurs. (or, just the char number instead (as in emacs's “point”))

• the matching pairs are all single unicode chars. They are these and nothing else: () {} [] “” ‹› «» 【】 〈〉 《》 「」 『』
Note that ‘single curly quote’ is not considered a matching pair here.

• Your script must be standalone. Must not be using some parser tools. But can call lib that's part of standard distribution in your lang.

Here's an example of mismatched brackets: ([)], (“[[”), ((, 】etc. (and yes, the brackets may be nested. There is usually text between these chars.)

I'll be writing a emacs lisp solution and post in 2 days. Ι welcome other lang implementations. In particular, perl, python, php, ruby, tcl, lua, Haskell, Ocaml. I'll also be able to eval common lisp (clisp) and Scheme lisp (scsh), Java. Other langs such as Clojure, Scala, C, C++, or any others, are all welcome, but i won't be able to eval them. A javascript implementation will be very interesting too, but please indicate which command line version to use and where to install it from.

I hope you'll find this an interesting “challenge”. This is a parsing problem. I haven't studied parsers except some Wikipedia reading, so my solution will probably be naive. I hope to see and learn from your solution too.

i hope you'll participate. Just post solution here. Thanks.

Xah
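One possible reading of the challenge, sketched in Python 3 with only the standard library. This is not any of the solutions posted in the thread; `PAIRS`, `first_mismatch`, and `check_tree` are hypothetical names, line and column numbers are 1-based, and the handling of openers still unclosed at end of file (reporting the earliest one) is an assumption, since the challenge text does not define that case.

```python
import os

# The eleven pairs listed in the challenge, keyed by opening char.
PAIRS = {"(": ")", "{": "}", "[": "]", "“": "”", "‹": "›", "«": "»",
         "【": "】", "〈": "〉", "《": "》", "「": "」", "『": "』"}
CLOSERS = set(PAIRS.values())

def first_mismatch(text):
    """Return (char, line, col) of the first problem, or None if all pairs match."""
    stack = []  # unclosed openers, as (char, line, col)
    line, col = 1, 0
    for char in text:
        if char == "\n":
            line, col = line + 1, 0
            continue
        col += 1
        if char in PAIRS:
            stack.append((char, line, col))
        elif char in CLOSERS:
            if stack and PAIRS[stack[-1][0]] == char:
                stack.pop()  # closes the most recent opener
            else:
                return (char, line, col)  # closer with no matching opener
    # Assumption: for openers never closed, report the earliest one.
    return stack[0] if stack else None

def check_tree(root):
    """Yield (path, mismatch) for every file under root that has a mismatch."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                bad = first_mismatch(f.read())
            if bad:
                yield path, bad
```

Usage would be something like `for path, (char, line, col) in check_tree("some/dir"): print(path, char, line, col)`. A file that is not valid UTF-8 raises `UnicodeDecodeError` in this sketch; whether to skip or report such files is left open.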
What Programing Language are the Largest Website Written In?
maybe this will be of interest.

〈What Programing Language Are the Largest Website Written In?〉
http://xahlee.org/comp/website_lang_popularity.html

-

i don't remember how, but today i suddenly got reminded that Facebook is written in PHP. So, on the spur of the moment, i tweeted:

“Remember folks, the world's largest sites {Facebook, Wikipedia, “Yahoo!”, etc} are written in Pretty Home Page!”

and followed with:

“To Chinese friends, what's Baidu, QQ, Taobao, Sina written in?”

Then, this question piqued me, even though i tried not to waste my time. But it overpowered me before i could resist, because i quickly spent 15 min writing this list (with help of Google):

 1 Google ◇ Java
 2 Facebook ◇ PHP
 3 YouTube ◇ Python
 4 Yahoo! ◇ PHP
 5 blogger.com ◇ Java
 6 baidu.com ◇ C/C++. perl/python/ruby
 7 Wikipedia ◇ PHP
 8 Windows Live live.com
 9 Twitter.com ◇ Scala and Ruby?
10 QQ.com ◇ ?
11 MSN.com ◇ ?
13 LinkedIn ◇ PHP?
15 TaoBao.com ◇ ?
16 sina.com.cn ◇ ?
17 Amazon.com ◇ ?
18 WordPress.com ◇ PHP
22 eBay.com ◇ ?
23 yandex.ru (Russian) ◇ ?
24 Bing ◇ ?
27 Microsoft.com ◇ ?
28 网易 163.com ◇ ?
29 PayPal.com ◇ Java?
31 新浪微博 weibo.com ◇ ?
32 Flickr.com ◇ ?
34 mail.ru ◇ ?
35 Craigslist.org ◇ perl
36 FC2.com ◇ ?
38 Apple.com ◇ Objective J?
39 imdb.com ◇ ?
41 VKontakte.ru ◇ ?
43 搜狐网 sohu.com ◇ ?
44 Ask.com ◇ ?
45 BBC.co.uk ◇ ?
46 tumblr.com ◇ PHP
47 LiveJasmin.com (porn) ◇ ?
48 xvideos.com (porn) ◇ ?
…
56 土豆网 Tudou.com ◇ ?
81 YouPorn.com ◇ ?
StumbleUpon.com ◇ PHP, Perl, C++
…

the numbers are site rankings, from alexa.com. (missing ones are mostly duplicates, such as google japan, google india, etc.)

i think a notable point of interest is that twitter stands out, with Scala and Ruby. Those with perl probably go back to the first dot com era (aka Web 1.0, ~1995 to ~2002). At that time, perl was basically the only game in town (secondarily: Java). (i don't recall what amazon and ebay were in... was it perl or php? how about imdb.com?)

most php sites follow, starting in the early 2000s; that's when PHP quietly surpassed perl on all battle fronts.

it'd be interesting to know what some of the chinese sites use, and porn sites (e.g. livejasmin, xvideos, youporn)

as for Microsoft sites... are they in C/C++ and/or dotnet?

Xah
Re: Lisp refactoring puzzle
2011-07-11 On Jul 11, 6:51 am, jvt vincent.to...@gmail.com wrote: I might as well toss my two cents in here. Xah, I don't believe that the functional programming idiom demands that we construct our entire program out of compositions and other combinators without ever naming anything. That is much more the province of so-called function- level programming languages like APL/J and to a more limited extent concatenative languages where data (but not code) is mostly left without names. Functional programming, in my mind, is about identifying reproducibly useful abstractions, _naming them_, and constructing other abstractions from them. Your piece of code above probably needs to be factored out into named pieces so that the composition is more sensible. If a piece of code isn't comprehensible, it might be because it isn't using the right abstractions in the right way, not because the notion of functional programming is itself problematic. One might instead provide a nightmare nest of procedural code and claim that procedural programming has problems. Of course, this particular kind of problem might be less common in procedural code, since it depends heavily on naming and side effecting values, but it isn't hard to find procedural code with a long list of operations and namings wherein the chose names are random or otherwise unrelated to the problem domain. My adviser in grad school used to name variables after pieces of furniture in dutch, but that didn't cause me to impeach the _notion_ of procedural code. hi jvt, of course, you are right. But i wasn't criticising functional programing in anyway. was just putting out my tale as a caution, to those of us — e.g. academic scheme lispers and haskell types — who are perpetually mangling their code for the ultimate elegant constructs. but speaking on this now... as you guys may know, i was a naive master of Mathematica while being absolute illiterate in computer science or any other lang. 
(see 〈Xah Lee's Computing Experience (Impression Of Lisp from Mathematica)〉 @ http://xahlee.org/PageTwo_dir/Personal_dir/xah_comp_exp.html ) When i didn't know anything about lisp, i thought lisp would be similar, or even better, as a highlevel lang in comparison to Mathematica. In retrospect now, i was totally wrong. lisp, or scheme lisp, is a magnitude more highlevel in comparison to C or C derivatives such as C++, Java. However, in comparison to Mathematica, it's one magnitude low level. (it pains me to see lisp experts here talking about cons and macros all day, even bigshot names such as one Paul Graham and in lisp books praising lisp macros. Quite ridiculous.) over the years, i had curiosity whether perhaps ML/OCaml, Haskell, would be equivalent high-level as Mathematica as i thought. Unfortunately, my study of them didn't went far. (best result is my incomplete 〈OCaml Tutorial〉 @ http://xahlee.org/ocaml/ocaml.html ) Am not qualified to comment on this, but i think that even Haskell, OCaml, are still quite low in comparison to Mathematica. it's funny, in all these supposedly modern high-level langs, they don't provide even simple list manipulation functions such as union, intersection, and the like. Not in perl, not in python, not in lisps. (sure, lib exists, but it's a ride in the wild) It's really exceedingly curious to me. And it seems that lang authors or its users, have all sorts of execuse or debate about whether those should be builtin if you force them to answer. (i.e. they don't get it) While, we see here regularly questions about implementing union etc with follow up of wild answers and re-invention the thousandth time. Of course, Mathematica has Union, Intersection, and a host of others some 20 years ago, and today it has a complete set of combinatorics functions as *builtin* functions (as opposed to add-on libs of second- rate quality). (this is not a question. 
No need to suggest some possible reasons why lang might not want to have a whole set of list manipulation builtin. You (the lisper/python/perl regulars and other lang fans) are a complete idiot, that's what i'm saying. COMPLETE IDIOT. (actually, this is not surprising, since genius and true thinkers are rare and few. (such as myself. As they say, beyond the times))) i also wondered, if Mathematica is truely a magnitude higher level than lisp, why we don't see any computer scientists talk about it? (of course there are, but almost non-existant in comparison to, say, academic publications on Scheme, Haskell, even Java) I think the reason is social, again. Proprietary langs isn't a revue of academicians, together with the fact that Stephen Wolfram goes about as if the entire science of computer science comprises of himself. Want Mathematica? Pay $2k+. recently i spent several days studying and watching a talk by Douglas Crockford. it is incredible. He went thru history, and explained, how it is the very people in computing community who laughed and stifled all the
Re: emacs lisp text processing example (html5 figure/figcaption)
On Jul 4, 12:13 pm, S.Mandl stefanma...@web.de wrote:
> Nice. I guess that XSLT would be another (the official) approach for such a task. Is there an XSLT-engine for Emacs? -- Stefan

haven't used XSLT, and don't know if there's one in emacs... it'd be nice if someone actually gave an example...

Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: emacs lisp text processing example (html5 figure/figcaption)
On Jul 5, 12:17 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee xah...@gmail.com wrote:
> > So, a solution by regex is out.
>
> Actually, none of the complications you listed appear to exclude regexes. Here's a possible (untested) solution:
>
> <div class="img">
> ((?:\s*<img src="[^.]+\.(?:jpg|png|gif)" alt="[^"]+" width="[0-9]+" height="[0-9]+">)+)
> \s*<p class="cpt">((?:[^<]|<(?!/p>))+)</p>
> \s*</div>
>
> and corresponding replacement string:
>
> <figure>
> \1
> <figcaption>\2</figcaption>
> </figure>
>
> I don't know what dialect Emacs uses for regexes; the above is the Python re dialect. I assume it is translatable. If not, then the above should at least work with other editors, such as Komodo's Find/Replace in Files command. I kept the line breaks here for readability, but for completeness they should be stripped out of the final regex.
>
> The possibility of nested HTML in the caption is allowed for by using a negative look-ahead assertion to accept any tag except a closing </p>. It would break if you had nested <p> tags, but then that would be invalid html anyway.
>
> Cheers, Ian

that's fantastic. Thanks! I'll try it out.

Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: emacs lisp text processing example (html5 figure/figcaption)
On Jul 5, 12:17 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
> On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee xah...@gmail.com wrote:
> > So, a solution by regex is out.
>
> Actually, none of the complications you listed appear to exclude regexes. Here's a possible (untested) solution:
>
> <div class="img">
> ((?:\s*<img src="[^.]+\.(?:jpg|png|gif)" alt="[^"]+" width="[0-9]+" height="[0-9]+">)+)
> \s*<p class="cpt">((?:[^<]|<(?!/p>))+)</p>
> \s*</div>
>
> and corresponding replacement string:
>
> <figure>
> \1
> <figcaption>\2</figcaption>
> </figure>
>
> I don't know what dialect Emacs uses for regexes; the above is the Python re dialect. I assume it is translatable. If not, then the above should at least work with other editors, such as Komodo's Find/Replace in Files command. I kept the line breaks here for readability, but for completeness they should be stripped out of the final regex.
>
> The possibility of nested HTML in the caption is allowed for by using a negative look-ahead assertion to accept any tag except a closing </p>. It would break if you had nested <p> tags, but then that would be invalid html anyway.
>
> Cheers, Ian

emacs regex supports shygroup (the 「(?:…)」) but it doesn't support the negative assertion 「?!…」 though.

but in any case, i can't see how this part would work

<p class="cpt">((?:[^<]|<(?!/p>))+)</p>

?

Xah -- http://mail.python.org/mailman/listinfo/python-list
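For readers wondering the same thing, here is a minimal Python sketch of how that caption group behaves. It is not Ian's tested code: the angle brackets and quote characters are restored from context (an assumption, since this archive copy lost them), and the names `PATTERN`, `REPLACEMENT` and `to_figure` are made up for illustration. The alternation `[^<]|<(?!/p>)` consumes either a non-`<` character, or a `<` that does not start `</p>`, so the group stops exactly at the closing tag of the caption.

```python
import re

# Reconstructed pattern (assumption: "<", ">" and '"' restored from context).
PATTERN = re.compile(
    r'<div class="img">'
    # group 1: one or more <img ...> tags, each optionally preceded by whitespace
    r'((?:\s*<img src="[^.]+\.(?:jpg|png|gif)" alt="[^"]+"'
    r' width="[0-9]+" height="[0-9]+">)+)'
    # group 2: caption text; any character except "<", or a "<" that is
    # NOT the start of "</p>" (the negative look-ahead)
    r'\s*<p class="cpt">((?:[^<]|<(?!/p>))+)</p>'
    r'\s*</div>'
)

# re.sub expands \1, \2 and \n inside the replacement template.
REPLACEMENT = r'<figure>\1\n<figcaption>\2</figcaption>\n</figure>'


def to_figure(html: str) -> str:
    """Rewrite every div.img block in html as a figure/figcaption block."""
    return PATTERN.sub(REPLACEMENT, html)


sample = (
    '<div class="img">\n'
    '<img src="jamie_cat.jpg" alt="jamie\'s cat" width="167" height="106">\n'
    '<p class="cpt">jamie\'s cat! Her blog is '
    '<a href="http://example.com/jamie/">http://example.com/jamie/</a></p>\n'
    '</div>'
)
result = to_figure(sample)
print(result)
```

The nested `<a>…</a>` in the caption survives because neither of its tags begins with `</p>`. As the thread notes, Emacs regexes have no look-ahead, so this form would need a different approach there.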
emacs lisp text processing example (html5 figure/figcaption)
OMG, emacs lisp beats perl/python again!

Hiya all, another little emacs lisp tutorial from the tiny Xah's Edu Corner.

〈Emacs Lisp: Processing HTML: Transform Tags to HTML5 “figure” and “figcaption” Tags〉 xahlee.org/emacs/elisp_batch_html5_tag_transform.html

plain text version follows.

--
Emacs Lisp: Processing HTML: Transform Tags to HTML5 “figure” and “figcaption” Tags

Xah Lee, 2011-07-03

Another triumph of using elisp for text processing over perl/python.

The Problem

-- Summary

I want batch transform the image tags in 5 thousand html files to use HTML5's new “figure” and “figcaption” tags.

I want to be able to view each change interactively, while optionally give it a “go ahead” to do the whole job in batch.

Interactive eye-ball verification on many cases lets me be reasonably sure the transform is done correctly. Yet i don't want to spend days to think/write/test a mathematically correct program that otherwise can be finished in 30 min with human interaction.

-- Detail

HTML5 has the following new tags: “figure” and “figcaption”. They are used like this:

<figure>
<img src="cat.jpg" alt="my cat" width="167" height="106">
<figcaption>my cat!</figcaption>
</figure>

(For detail, see: HTML5 “figure” & “figurecaption” Tags Browser Support)

On my website, i used a similar structure. They look like this:

<div class="img">
<img src="cat.jpg" alt="my cat" width="167" height="106">
<p class="cpt">my cat!</p>
</div>

So, i want to replace them with the HTML5's new tags. This can be done with a regex. Here's the “find” regex:

<div class="img">
?<img src="\([^.]+?\)\.jpg" alt="\([^"]+?\)" width="\([0-9]+?\)" height="\([0-9]+?\)">?
<p class="cpt">\([^<]+?\)</p>
?</div>

Here's the replacement string:

<figure>
<img src="\1.jpg" alt="\2" width="\3" height="\4">
<figcaption>\5</figcaption>
</figure>

Then, you can use “find-file” and dired's “dired-do-query-replace-regexp” to work on your 5 thousand pages. Nice. (See: Emacs: Interactively Find & Replace String Patterns on Multiple Files.)

However, the problem here is more complicated. The image file may be jpg or png or gif. Also, there may be more than one image per group. Also, the caption part may also contain complicated html. Here's some examples:

<div class="img">
<img src="cat1.jpg" alt="my cat" width="200" height="200">
<img src="cat2.jpg" alt="my cat" width="200" height="200">
<p class="cpt">my 2 cats</p>
</div>

<div class="img">
<img src="jamie_cat.jpg" alt="jamie's cat" width="167" height="106">
<p class="cpt">jamie's cat! Her blog is <a href="http://example.com/jamie/">http://example.com/jamie/</a></p>
</div>

So, a solution by regex is out.

Solution

The solution is pretty simple. Here's the major steps:

• Use “find-lisp-find-files” to traverse a dir.
• For each file, open it.
• Search for the string <div class="img">
• Use “sgml-skip-tag-forward” to jump to its closing tag.
• Save the positions of these tag begin/end positions.
• Ask user if she wants to replace. If so, do it. (using “delete-region” and “insert”)
• Repeat.

Here's the code:

;; -*- coding: utf-8 -*-
;; 2011-07-03
;; replace image tags to use html5's “figure” and “figcaption” tags.

;; Example. This:
;; <div class="img">…</div>
;; should become this
;; <figure>…</figure>

;; do this for all files in a dir.

;; rough steps:
;; find the <div class="img">
;; use sgml-skip-tag-forward to move to the ending tag.
;; save their positions.

(defun my-process-file (fpath)
  "process the file at fullpath FPATH ..."
  (let (mybuff p1 p2 p3 p4 )
    (setq mybuff (find-file fpath))
    (widen)
    (goto-char 0) ;; in case buffer already open
    (while (search-forward "<div class=\"img\">" nil t)
      (progn
        (setq p2 (point) )
        (backward-char 17) ; beginning of “div” tag
        (setq p1 (point) )
        (forward-char 1)
        (sgml-skip-tag-forward 1) ; move to the closing tag
        (setq p4 (point) )
        (backward-char 6) ; beginning of the closing div tag
        (setq p3 (point) )
        (narrow-to-region p1 p4)
        (when (y-or-n-p "replace?")
          (progn
            (delete-region p3 p4 )
            (goto-char p3)
            (insert "</figure>")
            (delete-region p1 p2 )
            (goto-char p1)
            (insert "<figure>")
            (widen) ) ) ) )
    (when (not (buffer-modified-p mybuff)) (kill-buffer mybuff) ) ) )

(require 'find-lisp)

(let (outputBuffer)
  (setq outputBuffer "*xah img/figure replace output*" )
  (with-output-to-temp-buffer outputBuffer
    (mapc 'my-process-file (find-lisp-find-files "~/web/xahlee_org/emacs/" "\\.html$"))
    (princ "Done deal!") ) )

Seems pretty simple right?

The “p1” and “p2” variables are the positions of start/end of <div class="img">. The “p3” and “p4” is the start/end of it's closing tag </div>.

We also used a little trick with “widen” and “narrow-to-region”. It lets me see just the part that i'm interested. It narrows to the beginning/end of the div.img. This makes eye-balling a bit easier.

The real time
what is the airspeed velocity of an unladen swallow
this will be of interest to those bleeding-edge pythoners. “what… is the airspeed velocity of an unladen swallow?” xahlee.org/funny/unladen_swallow.html Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
On Jun 18, 4:06 am, Dotan Cohen dotanco...@gmail.com wrote: On Sat, Jun 18, 2011 at 01:09, Xah Lee xah...@gmail.com wrote: thanks. didn't know about Ducky keyboard. Looks good. Also nice to hear your experience about Truly Ergonomic keyboard. I like it, see my first-hour review here:http://geekhack.org/showwiki.php?title=Island:18154 very nice review! and on geekhack.org too — the hardcore keyboard mod site! I enjoyed reading it. no actually i don't know how to make normal letter keys as (ctrl, alt) modifiers. You'll need a usb hid remapper. (there's a couple for mac os x i linked on my site but i couldn't verify cuz am now on a 6 years old powerpc with outdated mac os x) For Windows, Microsoft made a layout maker. I haven't used it so i don't know if it allows mapping letter keys as modifier. Have you tried it? I use Kubuntu Linux. i only started to use linux this month, from 10 years hiatus. First thing to do there is remap keys to the way i like of course. But am not familiar on how-to there. Seems xmodmap is becoming obsolete and XKB is in place. There's a couple nice sites about XKB but havn't had a chance to study them yet. May i ask you a few questions down the road? (maybe we can add each other on google talk or some social network) (off to email) Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
On Jun 14, 7:50 am, Dotan Cohen dotanco...@gmail.com wrote: On Mon, Jun 13, 2011 at 10:21, Elena egarr...@gmail.com wrote: On 13 Giu, 06:30, Tim Roberts t...@probo.com wrote: Studies have shown that even a strictly alphabetical layout works perfectly well, once the typist is acclimated. Once the user is acclimated to move her hands much more (about 40% more for Qwerty versus Dvorak), that is. And disproportionate usage of fingers. On QWERTY the weakest fingers (pinkies) do almost 1/4 of the keypresses when modifier keys, enter, tab, and backspace are taken into account. I'm developing a QWERTY-based layout that moves the load off the pinkies and onto the index fingers:http://dotancohen.com/eng/noah_ergonomic_keyboard_layout.html There is a Colemak version in the works as well. u r aware that there are already tens of layouts, each created by programer, thinking that they can create the best layout? if not, check 〈Computer Keyboards, Layouts, Hotkeys, Macros, RSI ⌨〉 xahlee.org/Periodic_dosage_dir/keyboarding.html on layout section. Lots people all creating layouts. also, you want to put {Enter, Tab}, etc keys in the middle, but I don't understand from ur website how u gonna do that since it requires keyboard hardware modification. e.g. r u creating key layout on PC keyboard or are you creating hardware keyboard Key layout? The former is a dime a million, the latter is rare but also there are several sites all trying to do it. Talk is cheap, the hardest part is actually to get money to finance and manufacture it. The latest one, which i deem good, is Truely Ergonomic keyboard. It sells for $200 and is in pre-order only now. Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
On Jun 15, 5:43 am, rusi rustompm...@gmail.com wrote:
> On Jun 15, 5:32 pm, Dotan Cohen dotanco...@gmail.com wrote:
> > Thanks. From testing small movements with my fingers I see that the fourth finger is in fact a bit weaker than the last finger, but more importantly, it is much less dexterous. Good to know!
>
> Most of the piano technique-icians emphasis, especially those of the last century like Hanon, was to cultivate 'independence' of the fingers. The main target of these attacks being the 4th finger. The number of potential-pianists who ruined their hands and lives chasing this holy grail is unknown

Hi rusi, am afraid i'm going to contradict what u say here.

i pretty much mastered Hanon 60. All of it, but that was 8 years ago.

The idea that the pinky is stronger than the 4th is silly. I can't fathom any logic or science to support that. Perhaps what u meant is that in many situations the use of the pinky can be worked around, because it is at the edge of your hand so you can apply a chopping motion or similar. (which is BAD if you want to develop piano finger skill) However, that's entirely different from saying the pinky is stronger than the 4th.

there's many ways we can cook up tests right away to see. e.g. try to squeeze a rubber ball with 4th and thumb. Repeat with pinky + thumb. Or, reverse the exercise by stretching a rubber band wrapped on the 2 fingers of interest. You can easily see that the pinky isn't stronger.

Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
On Jun 17, 2:26 pm, Dotan Cohen dotanco...@gmail.com wrote: On Fri, Jun 17, 2011 at 20:43, Xah Lee xah...@gmail.com wrote: u r aware that there are already tens of layouts, each created by programer, thinking that they can create the best layout? Yes. Mine is better :) Had Stallman not heard of VI when he set out to write Emacs? if not, check 〈Computer Keyboards, Layouts, Hotkeys, Macros, RSI ⌨〉 xahlee.org/Periodic_dosage_dir/keyboarding.html on layout section. Lots people all creating layouts. also, you want to put {Enter, Tab}, etc keys in the middle, but I don't understand from ur website how u gonna do that since it requires keyboard hardware modification. e.g. r u creating key layout on PC keyboard or are you creating hardware keyboard Key layout? The former is a dime a million, the latter is rare but also there are several sites all trying to do it. Talk is cheap, the hardest part is actually to get money to finance and manufacture it. The latest one, which i deem good, is Truely Ergonomic keyboard. It sells for $200 and is in pre-order only now. I ordered the Truley Ergonomic keyboard, I waited for half a year after delivery was supposed to happen to request my money back. Too many delays, so in the end I bought a Ducky mechanical (Cherry Browns) instead. I am writing a software keyboard layout. I'm actually having a hard time moving the modifier keys (Alt, Ctrl) to a new location. If you know how to do that I would much appreciate some advice, I'll post the problem here or in private mail. Thanks, Lee. (or should that be Thanks, Xah?) thanks. didn't know about Ducky keyboard. Looks good. Also nice to hear your experience about Truly Ergonomic keyboard. no actually i don't know how to make normal letter keys as (ctrl, alt) modifiers. You'll need a usb hid remapper. (there's a couple for mac os x i linked on my site but i couldn't verify cuz am now on a 6 years old powerpc with outdated mac os x) For Windows, Microsoft made a layout maker. 
I haven't used it so i don't know if it allows mapping letter keys as modifier. Have you tried it? i don't know much about the subject but from what i read am guessing it's possible, because each key just sends up/down signals. (whether you are using usb or ps/2 makes a difference too.) (am assuming above that you want to put modifiers in normal letter key positions. But if all you want to do is swap modifiers among themselves, that's pretty easy. Lots of tools to do that for mac and windows.) But even if you succeeded in putting modifiers in letter key positions, you may run into problems with key ghosting, because the circuits are designed to prevent ghosting on the qwerty layout only (with mod keys in their normal positions). Unless your keyboard is actually full n-key-rollover. maybe some of these are useful info, but maybe you are quite beyond that. Thanks for your info too. Good luck. just Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
On Jun 13, 6:45 pm, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
> Chris Angelico wrote:
> > And did any of the studies take into account the fact that a lot of computer users - in all but the purest data entry tasks - will use a mouse as well as a keyboard?
>
> What I think's really stupid is designing keyboards with two big blocks of keys between the alphabetic keys and the mouse. Back when standard-grade keyboards didn't usually have a built-in numeric keypad, it was much easier to move one's right hand back and forth between the keyboard and mouse. Nowadays I find myself perpetually prone to off-by-one errors when moving back to the keyboard. :-(

numerical keypad is useful to many. Most people can't touch type. Even among touch typists, many don't touch type the number keys. So, when they need to type a credit card number, phone number, etc, they go for the number pad.

Also, i think the number pad has essentially become a calculator for the vast majority of computer users. These days, almost all keyboards from Microsoft or Logitech have a Calculator button near the number pad to launch it.

i myself, am a qwerty typist since ~1987, also worked as data entry clerk for a couple of years. Am a dvorak touch typist since 1994. (and emacs since 1997) However, i never learned to touch type the numbers on the main section till i think ~2005. Since about 2008, i've used the numerical keypad as extra function keys.

Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
On Jun 13, 6:19 am, Steven D'Aprano 〔steve +comp.lang.pyt...@pearwood.info〕 wrote: │ I don't know if there are any studies that indicate how much of a │ programmer's work is actual mechanical typing but I'd be surprised if it │ were as much as 20% of the work day. The rest of the time being thinking, │ planning, debugging, communicating with customers or managers, reading │ documentation, testing, committing code, sketching data schemas on the │ whiteboard ... to say nothing of the dreaded strategy meetings. you can find the study on my site. URL in the first post of this thread. Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs. Colemak: is it Worthwhile to Improve the Dvorak Layout?
Ba Wha 13, 7:23 nz, Ehfgbz Zbql 〔ehfgbzcz...@tznvy.pbz〕 jebgr: │ Qibenx -- yvxr djregl naq nal bgure xrlobneq ynlbhg -- nffhzrf gur │ pbzchgre vf n glcrjevgre. │ Guvf zrnaf va rssrpg ng yrnfg gjb pbafgenvagf, arprffnel sbe gur │ glcrjevgre ohg abg sbe gur pbzchgre: │ │ n. Gur glcvfg pna glcr bayl 1 xrl ng n gvzr │ o. Bar (xrl)fgebxr trarengrf rknpgyl 1 yrggre │ │ Rkprcgvbaf gb [n] ner Fuvsg (Pgey) rgp ohg pyrneyl va ehaavat hfr gurl │ ner gur rkprcgvba abg gur ehyr. │ │ │ Jurer fcrrq ernyyl vf ivgny, fhpu nf sbe pbheg fgrabtencuref, fcrpvny zrpunavpny │ │ fubegunaq znpuvarf fhpu nf fgrabglcrf ner hfrq, pbfgvat gubhfnaqf bs qbyynef ohg nyybjvat │ │ gur glcvfg gb ernpu fcrrqf bs bire 300 jcz. │ │ Lrf, vafgehzragf yvxr fgrabglcrf fcrrq hc glcvat ol hanffhzvat [n] │ Yvxrjvfr cvnavfgf pna or fnvq (naq frra) gb qb zber ng gur cvnab guna │ glcvfgf ng n pbzchgre orpnhfr pubeqf ner cneg bs gur 'nyybjrq │ ynathntr'. │ │ Nffhzcgvba [o] yvxrjvfr vf haarprffnevyl erfgevpgvir ba n pbzchgre. │ Guvax bs nyy gur 'nooeri/favccrg/fubegsbez/grzcyngr' flfgrzf yvxr │ lnfavccrg, grkgzngr-favccrgf, rznpf/iv nooerif rgp. │ │ Sbe beqvanel Ratyvfu gurer ner guvatf yvxr xrlfpevcguggc://jjj.serrjrof.pbz/pnfflwnarx │ │ Sbe rknzcyr gur zbfg pbzzba jbeqf (rfgvzngrq gb or nebhaq 40% bs │ Ratyvfu) ner fubegsbezrq nf: │ o = ohg │ p = jvgu │ q = unq │ r = guvf │ s = bs │ t = gung │ u = gur │ w = juvpu │ a = naq │ ...rgp rgp hcgb │ m = jnf │ │ gura pbzzba cuenfrf │ noyr gb = po │ unq orra = qa │ qb abg = qk │ qvq abg = rk │ qbrf abg = qfk │ │ rgp │ │ Pyrneyl, sbe cebtenzzref guvf vf hayvxryl gb or zhpu hfr -- │ cebtenzzvat ynathntrf ner abg Ratyvfu. │ │ Ohg ohg vg vf pregnvayl na bcra dhrfgvba jurgure vs gur ercrngvat │ cnggreaf va cebtenzzvat ynathntrf ner pncgherq vagb fbzr flfgrz, gur │ erfhygvat orarsvg jbhyq or n zrer zvpeb-bcgvzvmngvba be fbzrguvat zber │ fvtavsvpnag. V unir frra fbzr tbbq cebtenzzref fjrne ol │ rznpf-lnfavccrgf, grkgzngr-favccrgf rgp. 
gurer'f fcrpvny vachg qrivprf qrfvtarq sbe pubeqvat, pnyyrq pubeqvat xrlobneq. Gurer'f qngnunaq. Ybbx hc Jvxvcrqvn sbe n yvfg. gurer'f nyfb xvarfvf naq bguref gung jbexf jvgu sbbg crqnyf. Fb, vg'f yvxr pubeqvat jvgu lbhe srrg gbb. Rire frra gubfr penml betnavfg jvgu srrg ohfl ba 30 crqnyf? unir lbh gevrq ibvpr vachg? Jvaqbjf pbzrf jvgu vg. Cerggl tbbq. Gubhtu, qbrfa'g jbex fb jryy jvgu nccf vzcyrzragrq bhgfvqr bs ZF'f senzrjbex, fhpu nf rznpf. fbzr cebtenzre'f fbyhgvbaf: 〈Pryroevgl Cebtenzref jvgu EFV (Ercrgvgvir Fgenva Vawhel)〉 uggc://knuyrr.bet/rznpf/rznpf_unaq_cnva_pryroevgl.ugzy Knu -- http://mail.python.org/mailman/listinfo/python-list
Re: Keyboard Layout: Dvorak vs. Colemak: is it Worthwhile to Improve the Dvorak Layout?
for some reason, was unable to post the previous message. (but can post others) So, the message is rot13'd and it works. Not sure what's up with Google groups. (this happened a few years back once. Apparently, the message content might have something to do with it because rot13 clearly works. Yet, the problem doesn't seem to be my name or embedded url, since it only happens with the previous message) -- http://mail.python.org/mailman/listinfo/python-list
Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?
(a lil weekend distraction from comp lang!)

in recent years, there came this Colemak layout. The guy who created it, Shai Coleman, has a site, and aggressively markets his layout. It's in linux distros by default, and has become somewhat popular. I remember first discovering it perhaps in 2007. Me, being a Dvorak typist since 1994, am curious about what he has to say in comparison. I recall, i was offended seeing how he paints a bias in peddling his creation. So, here, let me repaint his bias. Here it is, and judge for yourself.

〈Keyboard Layout: Dvorak vs Colemak: is it Worthwhile to Improve the Dvorak Layout?〉 http://xahlee.org/kbd/dvorak_vs_colemak.html

here's an interesting excerpt:

Just How Much Do You Type?

Many programmers claim to type 8 or 10 hours a day. They may be sitting in front of the computer all day, but the time their fingers actually dance on the keyboard is probably less than 1 hour per day.

Contrast data-entry clerks. They are the real typists. Their fingers actually type, continuously, for perhaps 6 hours per day.

It is important to get a sense of how much you actually type. This you can do by logging your keystrokes with software.

Let's assume a pro typist sustains 60 wpm. At the usual 5 strokes per word, 60 wpm is 300 strokes per min, or 18k per hour. Suppose she works 8 hours a day, and assume just 3 hours actually typing. 18k × 3 = 54k chars per day. With this figure, you can get a sense of how many “hours” you actually type per day.

I sit in front of computer on average 13 hours per day for the past several years. I program and write several blogs. My actual typing is probably double or triple that of average day-job programmers. From my emacs command frequency log for 6 months in 2008, it seems i only type 17k strokes per day. That's 31% of the data-entry clerk scenario above. Or, i only type ONE hour a day!

I was quite surprised how low my own figure is. But thinking about it… it makes sense.
Even though we sit in front of the computer all day, the actual typing is probably some minuscule percentage of that. Most of the time, you have to chat, lunch, run errands, browse web, read docs, run to the bathroom. Perhaps only half of your work time is active coding or writing (emails; docs). Of that duration, perhaps the majority of the time you are digesting the info on screen. Your whole day's typing probably can be done in less than 20 minutes if you just type continuously.

If your typing doesn't come anywhere close to a data-entry clerk's, then any layout “more efficient” than Dvorak is practically meaningless.

Xah -- http://mail.python.org/mailman/listinfo/python-list
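The arithmetic in the excerpt can be checked with a few lines of Python. This is only a sketch of the numbers quoted in the post; the 5-strokes-per-word convention is the one implied by "60 wpm is 300 strokes per min", and the variable names are made up for illustration.

```python
STROKES_PER_WORD = 5  # convention implied by "60 wpm is 300 strokes per min"


def strokes_per_hour(wpm: int) -> int:
    """Keystrokes produced in one hour of continuous typing at wpm."""
    return wpm * STROKES_PER_WORD * 60


# Data-entry clerk scenario from the post: 60 wpm, 3 hours of real typing.
clerk_daily = strokes_per_hour(60) * 3

# Figure from the post's emacs command-frequency log.
logged_daily = 17_000

share_of_clerk = logged_daily / clerk_daily
minutes_at_60wpm = logged_daily / (60 * STROKES_PER_WORD)

print(f"clerk: {clerk_daily} strokes/day")
print(f"logged: {share_of_clerk:.0%} of the clerk figure")
print(f"equivalent continuous typing: {minutes_at_60wpm:.0f} minutes")
```

To compare your own habits, replace `logged_daily` with a count from whatever keystroke logger you use.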
uhmm... your chance to spit on me
Dear lisp comrades, it's Friday!

Dear Xah, your writing is:

• Full of bad grammar. River of Hiccups.
• Stilted. Choked under useless structure and logic.
• WRONG: Filled with uncouth advice.
• Needlessly insulting. You have problems.
• Simply stinks. Worthless.
• Mediocre. Just like everybody, admit it.
• I love it.
• Your writing is pro!
• you are genius! one of the great expositors, essayists.
• Dude, you are full of shit. I've not seen a crank quite like you.

Vote at: http://xahlee.blogspot.com/2011/06/xahs-writing-is.html.

Xah (i code python small time too) -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
On May 26, 4:20 am, Thorsten Kampe thors...@thorstenkampe.de wrote: Did your mom tell you to recursively clean up your room?. that had me L O L! i think i'll quote in my unix hating blogs sometimes, if you don't mind. ☺ Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
On May 24, 3:06 pm, Rikishi42 skunkwo...@rikishi42.net wrote: On 2011-05-24, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: I think that is a patronizing remark that under-estimates the intelligence of lay people and over-estimates the difficulty of understanding recursion. Why would you presume this to be related to intelligence? The point was not about being *able* to understand, but about *needing* to understand in order to use. Maybe they don't need to understand recursion. So what? I think you should read the earlier posts again, this is drifting so far from what I intended. What I mean is: I'm certain that over the years I've had more than one person come to me and ask what 'Do you wish to delete this directory recursively?' meant. BAut never have I been asked to explain what 'Do you wish to delete this directory and it's subdirs/with all it's contents?' meant. Never. Recursion is a perfectly good English word, no more technical than accelerate or incinerate or dissolve or combustion. Do people need to know the word combustion when they could say burn instead? It wasn't about the word, but about the nature of the function. Besides, if the chance exists of a confusion between a recursive job and the fact the job is done using a recursive function... I would try staying away from the expression. Why not use 'delete a directory'. It's obvious the content gets binned, too. Do you know many people who incinerate leaves and branches in their garden? I burn them. Do they need to know the words microwave oven when they could be saying invisible rays cooking thing? The word oven has existed for ages, microwave is just a name for the type of oven. Not even a description, just a name. I wonder whether physicists insist that cars should have a go faster pedal because ordinary people don't need to understand Newton's Laws of Motion in order to drive cars? Gas pedal. Pedal was allraedy known when the car was invented. 
The simple addition of gas solved that need. Oh, and it's break pedal, not descellarator. (sp?) Who are you to say that people shouldn't be exposed to words you deem that they don't need to know? I'm one of the 'people'. You say exposed to, I say bothered/bored with. I have nothing against the use of a proper, precise term. And that word can be a complex one with many, many sylables (seems to add value, somehow). But I'm not an academic, so I don't admire the pedantic use of terms that need to be explained to 'lay' people. Especially if there is a widespread, usually shorter and much simpler one for it. A pointless effort if pointless, even when comming from a physicist. :-) very well said, Rikishi42. this one is probably the most intelligent post in this thread. Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
On May 25, 12:26 am, Thorsten Kampe thors...@thorstenkampe.de wrote: * Rikishi42 (Wed, 25 May 2011 00:06:06 +0200) On 2011-05-24, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: I think that is a patronizing remark that under-estimates the intelligence of lay people and over-estimates the difficulty of understanding recursion. Why would you presume this to be related to intelligence? The point was not about being *able* to understand, but about *needing* to understand in order to use. Maybe they don't need to understand recursion. So what? I think you should read the earlier posts again, this is drifting so far from what I intended. What I mean is: I'm certain that over the years I've had more than one person come to me and ask what 'Do you wish to delete this directory recursively?' meant. BAut never have I been asked to explain what 'Do you wish to delete this directory and it's subdirs/with all it's contents?' meant. Never. Naming something in the terms of its implementation details (in this case recursion) is a classical WTF. On the other hand, it's by far not the only WTF in Unix. For instance, how often have you read unlink instead of delete? Or directory instead of folder, pointing out that directory is the correct term because a directory is just a listing and does not contain the actual files. Of course these implementation details will never matter to anyone except under the rarest conditions. Thorsten well said. half of posts in this thread are from idiots. just incredible, but again, its newsgroups ... what am i thinking ... Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
On May 23, 9:28 pm, Chris Angelico ros...@gmail.com wrote: On Tue, May 24, 2011 at 2:20 PM, Xah Lee xah...@gmail.com wrote: why don't you file a bug report? In GNU Emacs 23.2, it's under the Help menu. I suppose it's the same in other emacs distro. Because I do not consider its behaviour to be errant. And I suspect its main developers won't either. That's why I suggested you grab the sources and make The Perfect Emacs. why don't you try http://ergoemacs.org/ ? Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
On May 22, 4:32 pm, Chris Angelico ros...@gmail.com wrote: On Mon, May 23, 2011 at 9:17 AM, Xah Lee xah...@gmail.com wrote: the context is this: In emacs directory manager (aka dired), when you call dired-do-delete on a directory, emacs prompts, this way: “Recursive delete of xx? (y or n)” But in order to make your point (such as it is), you are ignoring the fact that there are other uses of the term 'recurse' or 'recursive', and consistency and clarity are important. I don't see emacs offering me a chance to do a non-recursive delete; the only issue here seems to be that it's explicit that it is going to destroy an entire branch of the directory tree. If this is such a problem, grab the emacs sources and change that string - it probably occurs in exactly one place in the code. Voila! You now have The One True Perfect Emacs, the ultimate text editor, because it no longer tells you that it's working recursively. *removes tongue from cheek after saying that last sentence* Chris Angelico why don't you file a bug report? In GNU Emacs 23.2, it's under the Help menu. I suppose it's the same in other emacs distro. Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
Xah wrote: «In the emacs case: “Recursive delete of xx? (y or n) ”, what could it possibly mean by the word “recursive” there? Like, it might delete the directory but not delete all files in it? » Jonathan de Boyne Pollard wrote: It might *try* to delete the directory but not any of its contents, yes. you mean theoretically you see a possibility if the dir is implement as stilted as unix, but never in your life you find yourself might want to do it? Xah -- http://mail.python.org/mailman/listinfo/python-list
Functional Programing: stop using recursion, cons. Use map vectors
this is important but i think most lispers and functional programers still don't know it. Functional Programing: stop using recursion, cons. Use map vectors. 〈Guy Steele on Parallel Programing〉 http://xahlee.org/comp/Guy_Steele_parallel_computing.html btw, lists (as cons, car, cdr) in the lisp world has always been some kinda cult. Like, if you are showing some code example and you happened to use lisp vector datatype and not cons (lists) and it doesn't really matter in your case, but some lisper will always rise up to bug you, either as innocent curious question or attacking you for not “understanding” lisp. (just as other idiocies happen in other lang that lispers see but other langs don't see) it's interesting to me that all other high level langs: Mathematica, perl, python, php, javascript, all don't have linked list as lisp's list. It's also curious that somehow lispers never realises this. I've been having problems with lisp's cons ever since i'm learning Scheme Lisp in 1998 (but mostly the reason is language design at syntax and lack of abstraction level in calling “cons, car, cdr” stuff, without indexing mechanism). Realizing the algorithmic property and parallel- execution issues of linked list is only recent years. Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: English Idiom in Unix: Directory Recursively
On May 22, 3:46 pm, Chris Angelico ros...@gmail.com wrote: On Mon, May 23, 2011 at 6:22 AM, Xah Lee xah...@gmail.com wrote: Xah wrote: «In the emacs case: “Recursive delete of xx? (y or n) ”, what could it possibly mean by the word “recursive” there? Like, it might delete the directory but not delete all files in it? » Jonathan de Boyne Pollard wrote: It might *try* to delete the directory but not any of its contents, yes. you mean theoretically you see a possibility if the dir is implement as stilted as unix, but never in your life you find yourself might want to do it? There's a difference between working with a directory itself and working with files inside it. Generally, if you copy or delete a directory, you will want to recurse. But if you want to, for instance, wipe out all files whose names end with a tilde, then you might want to recurse and you might not. So it makes sense to offer the user a choice, and if recursive action is the only one that makes sense, at least acknowledge that the operation might take an arbitrarily long time. (Ever done a recursive operation on / on a large file system? Takes just a little bit longer than a non-recursive one under the same circumstances...) the context is this: In emacs directory manager (aka dired), when you call dired-do-delete on a directory, emacs prompts, this way: “Recursive delete of xx? (y or n)” Xah -- http://mail.python.org/mailman/listinfo/python-list
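The tilde-file case mentioned above, where you might or might not want to recurse, can be sketched in Python. This is only an illustration: the function name `find_backup_files` and its `recursive` flag are made up for this example, and it lists the files rather than deleting them.

```python
from pathlib import Path


def find_backup_files(root, recursive=False):
    """List files whose names end with '~'.

    recursive=False looks only at the directory itself;
    recursive=True also descends into every sub-directory.
    """
    base = Path(root)
    candidates = base.rglob("*~") if recursive else base.glob("*~")
    return sorted(p for p in candidates if p.is_file())
```

Calling `find_backup_files(".")` and `find_backup_files(".", recursive=True)` on the same directory shows the choice the user is being offered: the second call can return many more files and can take much longer on a large tree.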
English Idiom in Unix: Directory Recursively
might be of interest. 〈English Idiom in Unix: Directory Recursively〉 http://xahlee.org/comp/idiom_directory_recursively.html -- English Idiom in Unix: Directory Recursively Xah Lee, 2011-05-17 Today, let's discuss something in the category of linguistics. You know how in unix tools, when you want to delete the whole directory and all sub-directories and files in it, it's referred to as “recursive”? For example, when you want to delete the whole dir in emacs, it prompts this message: “Recursive delete of xx? (y or n) ”. (Note: to be able to delete whole dir in emacs in dired, you'll first need to turn it on. See: emacs dired tutorial.) Here's another example. A quote from “rsync” man page: … This would recursively transfer all files from the directory … -r, --recursive recurse into directories This tells rsync to copy directories recursively. See also --dirs (-d). … Here's a quote from “cp”'s man page: -R, -r, --recursive copy directories recursively and lots of other tools have a “-r” option, and they all refer to it as “recursive”. Though, if you think about it, it's not exactly a correct description. “Recursive”, or “recursion”, refers to a particular type of algorithm, or a implementation using that algorithm. Obviously, to process all of a directory's content does not necessarily mean it must be done by a recursive algorithm. A iteration can do it as well and it's easy to have the full behavior and properties in the result as a recursive approach, such as specifying depth order, level to dive into, etc. (because, dir is a tree, and recursive algorithm is useful for walking the tree data structure but is not necessary, because a tree can be laid out flat. Any path order taken by a recursive approach can be done by just enumerating the nodes in sequence. In fact, iteration approach can be faster and simpler in many aspects. (i wrote a article about this some 10 years ago, see: Trees and Indexes.) 
Note: this thought about tree and its nodes as a set of node addresses can be applied to any tree data structure, such as lisp's nested syntax, XML. See: Programing Language: Fundamental Problems of Lisp.) If you look at Windows or Mac OS X world, i don't think they ever refer to dealing with whole dir as “recursive” in user interface. For example, in Windows Vista, while changing properties of a folder, it has this message: Apply changes to this folder only. Apply changes to this folder, subfolders and files. Note the second choice. In unix, it would say “Apply changes to this folder recursively.” So, the word “recursive” used in unixes may be technically incorrect, but more so, it's just not the right phrase. Because, we want to communicate whether the whole content of a directory are processed, not about certain algorithm or how it is implemented. A simple “all the dir's branches/contents” or similar would be more apt. Recently i was chatting in Second Life with someone (Sleeves). She's typing, while i'm on voice. In part of our conversation, i said “you sounded fine”. Note that it's technically incorrect, because she's typing, not on voice. So she didn't actually make any “sound”. But to say “you typed fine”, or “you chatted fine”, won't get the message across. That's idiom. When you interpret a idiom logically, it doesn't make much sense, but people understand the particular phrase better anyway. I suspect the “directory recursively” is also a idiom. It seems so natural and really gets the point across, without any ill effects. Even if the implementation actually used a iteration, it doesn't seems to matter. So the interesting question is, why this idiom works? Or, how it developed? I think, among programers (which all unix users are in the 1970s), every one knows the concept of recursion, and many unix tools on dir probably are implemented with a recursive algorithm. 
When you say “… recursively”, the point gets across, because we all understand it, even when we are not actually talking about implementation. The phrase “… directory recursively” is short and memorable, while “… directory and all its contents” or “… directory and all its branches” or “… directory and all its sub-directories and files” are wordy and unwieldy. ✍ Idiocy Of Unix Copy Command Emacs Lisp Suggestion: Function to Copy/Delete a Directory Recursively How to rsync, unison, wget, curl Hunspell Tutorial Mac OS X Resource Fork and Command Line Tips ImageMagick Tutorial Making System Calls in Perl and Python Unix And Literary Correlation The Unix Pestilence To An Or Not To An On “I” versus “i” (capitalization of first person pronoun) On the Postposition of Conjunction in Penultimate Position of a Sequence What's Passive Voice? What's Aggressive Voice? Why You Should Avoid The Jargon “Tail Recursion” Why You should Not Use The Jargon Lisp1 and Lisp2 Jargons
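The claim in the article above, that processing a whole directory does not have to be done by a recursive algorithm, can be sketched in Python. Both function names are made up for this illustration; the second one uses a plain loop with an explicit list of directories still to visit, and never calls itself.

```python
import os


def walk_recursive(root):
    """Return every path under root, using a function that calls itself."""
    found = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        found.append(path)
        # Descend into real sub-directories only (skip symlinks to avoid loops).
        if os.path.isdir(path) and not os.path.islink(path):
            found.extend(walk_recursive(path))
    return found


def walk_iterative(root):
    """Return the same set of paths using a loop and an explicit stack."""
    found = []
    pending = [root]  # directories not yet listed
    while pending:
        current = pending.pop()
        for name in sorted(os.listdir(current)):
            path = os.path.join(current, name)
            found.append(path)
            if os.path.isdir(path) and not os.path.islink(path):
                pending.append(path)
    return found
```

The two functions visit the same paths, though not necessarily in the same order; sorting both results makes them comparable. A user asking to "delete the directory and all its contents" would get the same outcome from either.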
Re: Problems of Symbol Congestion in Computer Languages
On Mar 1, 3:40 pm, Chris Jones cjns1...@gmail.com wrote: At first it looks like something MS (Morgan Stanley..) dumped into the OSS lap fifteen years ago and nobody ever used it or maintained it.. so it takes a bit of digging to make it.. sort of work in current GNU/linux distributions.. especially since it knows nothing about Unicode. Here's the X/A+ map I came up with: // A+ keyboard layout: /usr/share/X11/xkb/symbols/apl // Chris Jones - 18/12/2010 // Enable via: // $ setxkbmap -v 10 apl default partial alphanumeric_keys modifier_keys xkb_symbols APL { name[Group1]= APL; // Alphanumeric section key TLDE { [ grave, asciitilde, 0x01fe, 0x017e ] }; key AE01 { [ 1, exclam, 0x01a1, 0x01e0 ] }; key AE02 { [ 2, at, 0x01a2, 0x01e6 ] }; key AE03 { [ 3, numbersign, 0x013c, 0x01e7 ] }; key AE04 { [ 4, dollar, 0x01a4, 0x01e8 ] }; key AE05 { [ 5, percent, 0x013d, 0x01f7 ] }; key AE06 { [ 6, asciicircum, 0x01a6, 0x01f4 ] }; key AE07 { [ 7, ampersand, 0x013e, 0x01e1 ] }; key AE08 { [ 8, asterisk, 0x01a8, 0x01f0 ] }; key AE09 { [ 9, parenleft, 0x01a9, 0x01b9 ] }; key AE10 { [ 0, parenright, 0x015e, 0x01b0 ] }; key AE11 { [ minus, underscore, 0x01ab, 0x0121 ] }; key AE12 { [ equal, plus, 0x01df, 0x01ad ] }; key AD01 { [ q, Q, 0x013f, 0x01bf ] }; key AD02 { [ w, W, 0x01d7, Nosymbol ] }; key AD03 { [ e, E, 0x01c5, 0x01e5 ] }; key AD04 { [ r, R, 0x01d2, Nosymbol ] }; key AD05 { [ t, T, 0x017e, Nosymbol ] }; key AD06 { [ y, Y, 0x01d9, 0x01b4 ] }; key AD07 { [ u, U, 0x01d5, Nosymbol ] }; key AD08 { [ i, I, 0x01c9, 0x01e9 ] }; key AD09 { [ o, O, 0x01cf, 0x01ef ] }; key AD10 { [ p, P, 0x012a, 0x01b3 ] }; key AD11 { [ bracketleft, braceleft, 0x01fb, 0x01dd ] }; key AD12 { [ bracketright, braceright, 0x01fd, 0x01db ] }; key AC01 { [ a, A, 0x01c1, Nosymbol ] }; key AC02 { [ s, S, 0x01d3, 0x01be ] }; key AC03 { [ d, D, 0x01c4, Nosymbol ] }; key AC04 { [ f, F, 0x015f, 0x01bd ] }; key AC05 { [ g, G, 0x01c7, 0x01e7 ] }; key AC06 { [ h, H, 0x01c8, 0x01e8 ] }; key AC07 { [ j, J, 0x01ca, 0x01ea 
] }; key AC08 { [ k, K, 0x0127, Nosymbol ] }; key AC09 { [ l, L, 0x01cc, 0x01ec ] }; key AC10 { [ semicolon, colon, 0x01db, 0x01bc ] }; key AC11 { [ apostrophe, quotedbl, 0x01dd, 0x01bb ] }; key AB01 { [ z, Z, 0x01da, 0x01fa ] }; key AB02 { [ x, X, 0x01d8, Nosymbol ] }; key AB03 { [ c, C, 0x01c3, 0x01e3 ] }; key AB04 { [ v, V, 0x01d6, Nosymbol ] }; key AB05 { [ b, B, 0x01c2, 0x01e2 ] }; key AB06 { [ n, N, 0x01ce, 0x01ee ] }; key AB07 { [ m, M, 0x017c, 0x01cd ] }; key AB08 { [ comma, less, 0x01ac, 0x013c ] }; key AB09 { [ period, greater, 0x01dc, 0x01ae ] }; key AB10 { [ slash, question, 0x01af, 0x013f ] }; key BKSL { [ backslash,
Re: Problems of Symbol Congestion in Computer Languages
On Feb 28, 7:30 pm, rusi rustompm...@gmail.com wrote: On Feb 28, 11:39 pm, Dotan Cohen dotanco...@gmail.com wrote: You miss the canonical bad character reuse case: = vs ==. Had there been more meta keys, it might be nice to have a symbol for each key on the keyboard. I personally have experimented with putting the symbols as regular keys and the numbers as the Shifted versions. It's great for programming. Hmmm... Clever! Is it X or Windows? Can I have your setup? hi Russ, there's a programer's dvorak layout i think is bundled with linux. or you can do it with xmodmap on X-11 or AutoHotKey on Windows, or within emacs... On the mac, you can use keyboardMaestro, Quickeys, or just write a os wide config file yourself. You can see tutorials and sample files for all these here http://xahlee.org/Periodic_dosage_dir/keyboarding.html i'd be interested to know what Dotan Cohen use too. i tried the swapping number row with symbols a few years back. didn't like it so much because numbers are frequently used as well, especially when you need to enter a series of numbers. e.g. heavy math, or dates 2010-02-28. One can use the number pad but i use that as extra programable buttons. Xah One problem we programmers face is that keyboards were made for typists not programmers. Another is that when we move from 'hi-level' questions eg code reuse -- to lower and lower -- eg ergonomics of reading and writing code -- the focus goes from the center of consciousness to the periphery and we miss how many inefficiencies there are in our semi-automatic actions. -- http://mail.python.org/mailman/listinfo/python-list
Re: Problems of Symbol Congestion in Computer Languages
On 2011-02-16, Xah Lee wrote: │ Vast majority of computer languages use ASCII as its character set. │ This means, it jams multitude of operators into about 20 symbols. │ Often, a symbol has multiple meanings depending on context. On 2011-02-17, rantingrick wrote: … On 2011-02-17, Cthun wrote: │ And you omitted the #1 most serious objection to Xah's proposal, │ rantingrick, which is that to implement it would require unrealistic │ things such as replacing every 101-key keyboard with 10001-key keyboards │ and training everyone to use them. Xah would have us all replace our │ workstations with machines that resemble pipe organs, rantingrick, or │ perhaps the cockpits of the three surviving Space Shuttles. No doubt │ they'd be enormously expensive, as well as much more difficult to learn │ to use, rantingrick. keyboard shouldn't be a problem. Look at APL users. http://en.wikipedia.org/wiki/APL_(programming_language) they are happy campers. Look at Mathematica, which support a lot math symbols since v3 (~1997) before unicode became popular. see: 〈How Mathematica does Unicode?〉 http://xahlee.org/math/mathematica_unicode.html word processors, also automatically do symbols such as “curly quotes”, trade mark sign ™, copyright sign ©, arrow →, bullet •, ellipsis … etc, and the number of people who produce document with these chars are probably more than the number of programers. in emacs, i recently also wrote a mode that lets you easily input few hundred unicode chars. 〈Emacs Math Symbols Input Mode (xmsi-mode)〉 http://xahlee.org/emacs/xmsi-math-symbols-input.html the essence is that you just need a input system. look at Chinese, Japanese, Korean, or Islamic. They happily type without requiring that every symbol they use must have a corresponding key on keyboard. Some lang, such as Chinese, that's impossible or impractical. 
when a input system is well designed, it could be actually more efficient than keyboard combinations to type special symbols (such as in Mac OS X's opt key, or Windows's AltGraph). Because a input system can be context based, that it looks at adjacent text to guess what you want. for example, when you type >= in python, the text editor can automatically change it to ≥ (when it detects that it's appropriate, e.g. there's a “if” nearby) Chinese phonetic input system use this extensively. Abbrev system in word processors and emacs is also a form of this. I wrote some thought about this here: 〈Designing a Math Symbols Input System〉 http://xahlee.org/comp/design_math_symbol_input.html Xah Lee -- http://mail.python.org/mailman/listinfo/python-list
Re: Problems of Symbol Congestion in Computer Languages
Chris Jones wrote: «.. from a quite different perspective it may be worth noting that practically all programming languages (not to mention the attached documentation) are based on the English language. And interestingly enough, most any software of note appears to have come out of cultures where English is either the native language, or where the native language is either relatively close to English.. Northern Europe mostly.. and not to some small extent, countries where English is well-established as a universal second language, such as India. Always struck me as odd that a country like Japan for instance, with all its achievements in the industrial realm, never came up with one single major piece of software.» btw, english is one of the two of India's official lang. It's used between Indians, and i think it's rare or non-existent for a college in india that uses local dialect. (this is second hand knowledge. I learned this in Wikipedia and experience with indian co-workers) i also wondered about why japan doesn't seems to have created major software or OS. Though, Ruby is invented in Japan. I do think they have some OSes just not that popular... i think for special purposes OSes, they have quite a lot ... from Mitsubishi, NEC, etc... in their huge robotics industry among others. (again, this is all second hand knowledge) ... i recall having read non-english comp lang that appeared recently... Xah Lee -- http://mail.python.org/mailman/listinfo/python-list
Problems of Symbol Congestion in Computer Languages
might be interesting. 〈Problems of Symbol Congestion in Computer Languages (ASCII Jam; Unicode; Fortress)〉 http://xahlee.org/comp/comp_lang_unicode.html -- Problems of Symbol Congestion in Computer Languages (ASCII Jam; Unicode; Fortress) Xah Lee, 2011-02-05, 2011-02-15 Vast majority of computer languages use ASCII as its character set. This means, it jams multitude of operators into about 20 symbols. Often, a symbol has multiple meanings depending on context. Also, a sequence of chars are used as a single symbol as a workaround for lack of symbols. Even for languages that use Unicode as its char set (e.g. Java, XML), often still use the ~20 ASCII symbols for all its operators. The only exceptions i know of are Mathematica, Fortress, APL. This page gives some examples of problems created by symbol congestion. --- Symbol Congestion Workarounds Multiple Meanings of a Symbol Here are some common examples of a symbol that has multiple meanings depending on context: In Java, [ ] is a delimiter for array, also a delimiter for getting a element of array, also as part of the syntax for declaring a array type. In Java and many other langs, ( ) is used for expression grouping, also as delimiter for arguments of a function call, also as delimiters for parameters of a function's declaration. In Perl and many other langs, : is used as a separator in a ternary expression e.g. (test ? yes : no), also as a namespace separator (e.g. use Data::Dumper;). In URL, / is used as path separators, but also as indicator of protocol. e.g. http://example.org/comp/unicode.html In Python and many others, < is used for “less than” boolean operator, but also as a alignment flag in its “format” method, also as a delimiter of named group in regex, and also as part of char in other operators that are made of 2 or more chars, e.g. <= and <<=. 
Examples of Multi-Char Operators Here are some common examples of operators that are made of multiple characters: || == = != ** =+ =* := ++ -- :: // /* (* … --- Fortress & Unicode The language designer Guy Steele recently gave a very interesting talk. See: Guy Steele on Parallel Programing. In it, he showed code snippets of his language Fortress, which freely uses Unicode as operators. For example, list delimiters are not the typical curly bracket {1,2,3} or square bracket [1,2,3], but the unicode angle bracket ⟨1,2,3⟩. (See: Matching Brackets in Unicode.) It also uses the circle plus ⊕ as operator. (See: Math Symbols in Unicode.) --- Problems of Symbol Congestion I really appreciate such use of unicode. The tradition of sticking to the 95 chars in ASCII of 1960s is extremely limiting. It creates complex problems manifested in: * String Escape mechanism (C's backslash \n, \/, …, widely adopted.) * Complex delimiters for strings. (Python's triple quotes and perl's variable delimiters q() q[] q{} m//, and heredoc. (See: Strings in Perl and Python ◇ Heredoc mechanism in PHP and Perl.) * Crazy leaning toothpicks syndrome, especially bad in emacs regex. * Complexities in character representation (See: Emacs's Key Notations Explained (/r, ^M, C-m, RET, return, M-, meta) ◇ HTML entities problems. See: HTML Entities, Ampersand, Unicode, Semantics.) * URL Percent Encoding problems and complexities: Javascript Encode URL, Escape String All these problems occur because we are jamming so many meanings into about 20 symbols in ASCII. See also: * Computer Language Design: Strings Syntax * HTML6: Your JSON and SXML Simplified Most of today's languages do not support unicode in function or variable names, so you can forget about using unicode in variable names (e.g. α=3) or function names (e.g. “lambda” as “λ” or “function” as “ƒ”), or defining your own operators (e.g. “⊕”). However, there are a few languages i know that do support unicode in function or variable names. 
Some of these allow you to define your own operators. However, they may not allow unicode for the operator symbol. See: Unicode Support in Ruby, Perl, Python, javascript, Java, Emacs Lisp, Mathematica. Xah -- http://mail.python.org/mailman/listinfo/python-list
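As a small illustration of the point about unicode in names, here is what Python 3 accepts. This is a sketch for Python 3 only (Python 2 rejects these names), and the names α and ƒ are just the examples from the article, not anything standard.

```python
# Python 3 allows letters from other scripts in identifiers.
α = 3


def ƒ(x):
    """A function whose name is the 'f with hook' character."""
    return x + α


print(ƒ(4))  # 7

# A math symbol such as ⊕ is not a letter, so it cannot be used as a name,
# and Python has no way to declare it as a new operator.
print("α".isidentifier(), "⊕".isidentifier())  # True False
```

So in Python 3 the variable and function name cases work, while the "define your own operator ⊕" case does not.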
Re: How to Write grep in Emacs Lisp (tutorial)
On Feb 11, 2:06 am, Alexander Gattin xr...@yandex.ru wrote: Hello, On Tue, Feb 08, 2011 at 05:32:05PM +, Icarus Sparry wrote: The key thing which makes this 'modern' is the '+' at the end of the command, rather than '\;'. This causes find to execute the grep once per group of files, rather than once per file. many thanks to you, man! I'm surprised to find out that this works on HP-UX B.11.31 and SunOS 5.9 (but not on HP Tru64 UNIX V5.1B). Is HP-UX still alive? lol. in 2000 i ported our ecommerce web app from Solaris to it. Am not exactly thrilled. At the time, i vaguely recall, the HP sales guys come to us and tells us they have this heart-beat technology ... Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: How to Write grep in Emacs Lisp (tutorial)
On Feb 8, 9:32 am, Icarus Sparry i.sparry...@gmail.com wrote: On Tue, 08 Feb 2011 13:51:54 +0100, Petter Gustad wrote: Xah Lee xah...@gmail.com writes: problem with find xargs is that they spawn grep for each file, which becomes too slow to be usable. find . -maxdepth 2 -name '*.html' -print0 | xargs -0 grep whatever will call grep with a list of filenames given by find, only a single grep process will run. //Petter This is getting off-topic for the listed newsgroups and into comp.unix.shell (although the question was originally posed in a MS windows context). The 'modern' way to do this is find . -maxdepth 2 -name '*.html' -exec grep whatever {} + The key thing which makes this 'modern' is the '+' at the end of the command, rather than '\;'. This causes find to execute the grep once per group of files, rather than once per file. Nice. When was the + introduced? Xah -- http://mail.python.org/mailman/listinfo/python-list
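The difference between "once per file" and "once per group of files" can be sketched in Python by building the command lines without running them. This is only a model of the idea: the function names and the `max_args` limit are invented for the example, and the real find picks its group size from the system's command-line length limit rather than from a fixed count.

```python
def build_commands_per_file(command, files):
    # Models the '\;' form: one command line (one process) for each file.
    return [command + [name] for name in files]


def build_commands_batched(command, files, max_args=3):
    # Models the '+' form: one command line for each group of files.
    return [
        command + files[start:start + max_args]
        for start in range(0, len(files), max_args)
    ]


files = ["page%d.html" % n for n in range(7)]
grep = ["grep", "whatever"]
print(len(build_commands_per_file(grep, files)))  # 7
print(len(build_commands_batched(grep, files)))   # 3
```

With thousands of files the first form starts thousands of grep processes, which is the slowness complained about earlier in the thread; the second starts only a handful.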
Guy Steele on Parallel Programing
might be interesting. 〈Guy Steele on Parallel Programing〉 http://xahlee.org/comp/Guy_Steele_parallel_computing.html -- Guy Steele on Parallel Programing Xah Lee, 2011-02-05 A fascinating talk by the well respected computer scientist Guy Steele. (famously known as one of the author of Scheme Lisp) How to Think about Parallel Programming: Not! (2011-01-14) By Guy Steele. @ http://www.infoq.com/presentations/Thinking-Parallel-Programming The talk is a bit long, at 70 minutes. The first 26 minutes he goes thru 2 computer programs written for 1970's machines. It's quite interesting to see how software on punch card works. For most of us, we never seen a punch card. He actually goes thru it “line by line”, actually “hole by hole”. Watching it, it gives you a sense of how computers are like in the 1970s. At 00:27, he starts talking about “automating resource management”, and quickly to the main point of his talk, as in the title, about what sort of programing paradigms that are good for parallel programing. Here, parallel programing means solving a problem by utilizing multiple CPU or nodes (as in clusters or grid). This is important today, because CPU don't get much faster anymore; instead, each our computer are getting more CPU (multi-core). In the rest 40 min of the talk, he steps thru 2 programs that solves a simple problem of splitting a sentence into words. First program is typical sequential style, using do-loop (accumulator). The second program is written in his language Fortress, using functional style. He then summarizes a few key problems with traditional programing patterns, and introduces a few critical programing patterns that he thinks is critical for programing languages to automate parallel computing. In summary, as a premise, he believes that programers should not worry about parallelism at all, but the programing language should automatically do it. 
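The word-splitting problem described above can be sketched in Python to show the contrast between the two styles. The talk's own code is in Fortress; what follows is not that code. It is a reconstruction of the general idea under my own assumptions, with every name (`to_piece`, `combine`, `split_by_parts`, and so on) made up for the illustration. The first function is the sequential accumulator loop. The rest summarizes each slice of the text independently and then merges the summaries with an associative `combine`, so the slices could in principle be handled on different CPUs.

```python
from functools import reduce


def split_sequential(text):
    """Accumulator loop: each step depends on the state left by the previous one."""
    words, current = [], ""
    for ch in text:
        if ch == " ":
            if current:
                words.append(current)
            current = ""
        else:
            current += ch
    if current:
        words.append(current)
    return words


def maybe_word(s):
    return [s] if s else []


def to_piece(ch):
    # A "chunk" is text with no space seen yet. A "seg" holds a partial word
    # on the left, the complete words in the middle, a partial word on the right.
    return ("seg", "", [], "") if ch == " " else ("chunk", ch)


def combine(a, b):
    """Merge the summaries of two adjacent pieces of text (associative)."""
    if a[0] == "chunk" and b[0] == "chunk":
        return ("chunk", a[1] + b[1])
    if a[0] == "chunk":
        _, left, words, right = b
        return ("seg", a[1] + left, words, right)
    if b[0] == "chunk":
        _, left, words, right = a
        return ("seg", left, words, right + b[1])
    _, l1, w1, r1 = a
    _, l2, w2, r2 = b
    # The right edge of a and the left edge of b may join into one word.
    return ("seg", l1, w1 + maybe_word(r1 + l2) + w2, r2)


EMPTY = ("chunk", "")  # identity element for combine


def summarize(text):
    return reduce(combine, map(to_piece, text), EMPTY)


def finish(piece):
    if piece[0] == "chunk":
        return maybe_word(piece[1])
    _, left, words, right = piece
    return maybe_word(left) + words + maybe_word(right)


def split_by_parts(text, parts=4):
    """Cut the text into slices, summarize each one on its own, then merge."""
    size = max(1, -(-len(text) // parts))  # ceiling division
    slices = [text[i:i + size] for i in range(0, len(text), size)]
    partials = [summarize(s) for s in slices]  # independent of each other
    return finish(reduce(combine, partials, EMPTY))
```

Because `combine` gives the same answer however the pieces are grouped, `split_by_parts` returns the same words as `split_sequential` for any number of parts, even when a cut lands in the middle of a word. This sketch runs the slices one after another; it only shows that nothing in the logic forces that order.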
Then, he illustrates that there are few programing patterns that we must stop using, because if you do write your code in such paradigm, then it would be very hard to parallelize the code, either manually or by machine AI. If you are a functional programer and read FP news in the last couple of years, his talk doesn't cover much new. However, i find it very interesting, because: * ① This is the first time i see Guy Steele talk. He talks slightly fast. Very bright guy. * ② The detailed discussion of punch card code on 1970's machine is quite a eye-opener for those of us who's not in that era. * ③ You get to see Fortress code, and its use of fancy unicode chars. * ④ Thru the latter half of talk, you get a concrete sense of some critical “do's & don'ts” in coding paradigms about what makes automated parallel programing possible or impossible. In much of 2000s, i did not understand why compilers couldn't just automatically do parallelism. I thought about it in 2009, and realized why. See: Why Must Software Be Rewritten For Multi-Core Processors?. Parallel Computing vs Concurrency Problems Note that parallel programing and concurrency problem are not the same thing. Parallel programing is about writing code that can use multiple CPU. Concurrency problems is about problems of concurrent computations using the same resource or time (e.g. race condition, file locking). See: Parallel computing ◇ Concurrency (computer science) Fortress & Unicode It's interesting that Fortress language freely uses unicode chars. For example, list delimiters are not the typical {} or [], but the unicode angle bracket 〈〉. (See: Matching Brackets in Unicode.) It also uses the circle plus ⊕ as operator. (See: Math Symbols in Unicode.) I really appreciate such use of unicode. The tradition of sticking to the 95 chars in ASCII of 1960s is extremely limiting. It creates complex problems manifested in: * String Escape mechanism (C's “\n”, widely adopted.) 
* Complex delimiters (Python's triple quotes and perl's variable delimiters q() q[] q{} m//, and heredoc. (See: Strings in Perl and Python ◇ Heredoc mechanism in PHP and Perl.) * Crazy leaning toothpick syndrome, especially bad in emacs regex. * Complexities in character representation (See: Emacs's Key Notations Explained (/r, ^M, C-m, RET, return, M-, meta) ◇ HTML entities problems. See: HTML Entities, Ampersand, Unicode, Semantics.) * URL Percent Encoding problems and complexities: Javascript Encode URL, Escape String All these problems occur because we are jamming so many meanings into about 20 symbols in ASCII. See also: * Computer Language Design: Strings Syntax * HTML6: Your JSON and SXML Simplified Was this page useful to you? Also, almost all of today's languages do not support unicode in function or variable names, so you can forget about using unicode in variable names (e.g. α=3) or function names (e.g. “lambda” as “λ” or “function” as “ƒ”), or defining your own
Re: do you know what's CGI? (web history personal story)
Avatar was very disappointing (Both in graphics and story) but maybe i expect too much...? The story was clearly Pocahontas… in Space!, which was very disappointing. I have to disagree.. Loly. At this point, i must voice Xah's Point Of View. 〈Avatar and District 9 Movie Review〉 http://xahlee.org/Periodic_dosage_dir/skina/avatar.html -- Avatar and District 9 Movie Review Xah Lee, 2010-01-07 -- Avatar Went to watch the movie Avatar (2009 film) in theater today. Boo. On a scale of 1 to 10, i'd say this is no more than 7. This movie is totally predicable, stereotypical, intellectually shallow. The 3D effect isn't impressive at all, and about the only thing that is positive about this movie is the imaginative flora and fauna. This movies garnered raving reviews, both by critics as well as being a highly successful money maker. But it's so disappointing to me that i have to think about where to begin. -- 3D Effect Ok, lets begin at some easy criticisms, the 3D tech. I recall, back in late 1970s or early 1980s when i was about 10 or so, my mom's mom took me to see one of the first 3D film, in Taiwan, a kung fu film. I vividly recall that i physically dodged when the weapons swung towards me from the screen. Yeah, a lot people did that. That, is the effect of good 3D on you. But now, after 30 years, one'd suppose that the 3D tech has improved vastly, which it has. However, watching Avatar, i hardly get ANY 3D sense at all. In fact, i absolutely don't feel any 3D sense, perhaps a little, if i force my self to feel it, thru its 3D glasses. (I did not watch it on iMax) What's wrong? I don't know. Perhaps the 3D tech is different. I don't remember which 3D films i've watched back 30 years ago, but am guessing that some 3D tech are designed to have a exaggerated perspectivity, and am guessing the 3D tech used in this movie is designed to be more mellow or wide angle. But over all, i say bah. 
-- Predicable Ok, now i might disclose some of this movie's plot, and so here's your “spoiler” warning, but, the movie is so formula driven and stereotypical that it doesn't matter much. The movie, in one sentence, is about Western powers with high tech wanting to take over gold from some primitive, indigenous people, for their riches in their land. Yeah, that's it. And, yes, there's a hero, who gradually realized that this isn't right, and fell in love with one of the beautiful chick from the indigenous people (you guessed right, the daughter of a chieftain!), and saved the tribe, with the help of local animals and magical nature. The movies runs 2.5 hours. I didn't particular cry “move on already” at any point, but nor did the long movie had my attention wholly seized. There are no characters. All are shallow. The bad guys, in this case, the corporation head and the head of marines, are just what they are. The corporation head has eyes on gold, and that's his only concern. The marines head has bulky muscles, and is all about toughness. The hero, is just that, with good heart, and handsome to boot, courageous, always miraculously succeeds against all odds, and gets his girl. The heroin, in this case a alien race chick, is of course beautiful as much beauty we can put on a feline humanoid. And what's she like? Well, a beautiful woman, with concerns of loyalty of her man, love of her family, her people, a caring of nature. Actually had sex with the half-human half-alien hero. (inter-species porn anyone?) -- Where is the Science in Sci-Fi? What about the story line? Well, the human animals want this million- dollar land inhabited by primitive tribes. The human animals created what's called a “avatar”, which is a humanoid creature grown from bio- tubes that has mixed DNA from humans and the native feline-like humanoid aliens. The avatar is connected and controlled by a sleeping human. When one is awake, the other goes to sleep. 
Thru the avatars, it is thought that they can persuade the feline humanoids to move out. But the diplomatic cunning didn't work out, of course, and violence is resorted to. The hero fell in love with the heroin, and grew the sense of American Justice, and defended the primitive feline-humanoids with the help of miracle nature. The Sci-Fi aspect of the avatar concept is all interesting. How does the avatar work? How's it grown? How long does it take? How's the technology to control or connect it? What's the biology of the alien? What they eat? Well, this movie isn't concerned about these things, only that these settings qualify it as Sci-Fi flick. Another interesting aspect of sci-fi is that the plants and animals on this alien place have some sort bio-wire grown from their body, that allows direct animal-to-animal communication or animal-to-plant. For example, the feline-humanoid can connect her bio-wire grown from her
do you know what's CGI? (web history personal story)
some extempore thought. Do you know what is CGI? Worked with Mathematica for 5 hours yesterday. Fantastic! This old hand can still do something! lol. My plane curve packages soon to be out n am gonna be rich. ...gosh what godly hours i've spent on Mathematica in 1990s. Surprised to find that i even Unprotected builtin symbols to fix things. (get rid of asymptotes in ParametricPlot) (Draft notes as i go: Mathematica Version 3 to Version 7 Conversion Notes) ... i recall, i stopped doing Mathematica in 1998 because it's a career dead-end as a programing lang, and dived into the utterly idiotic Perl unix mysql world. (See: The Unix Pestilence ◇ Xah Lee's Computing Experience (Impression Of Lisp from Mathematica).) Well, dead-end just as Emacs Lisp i'm spending my nights with in the past 4 years. LOL. And on that note, same thing can be said of haskell, OCaml. Though, fringe langs are picking up these days. Remember Python, ruby, in year 2000? Who'd have imagined they'd become mainstream. But it took 10+ years. (See: Language, Purity, Cult, and Deception.) Also got reminded of my age recently. Someone on stackoverflow is asking about what are those “A:” and “B:” drives on Windows. (anyone heard of floppy drives?) In another incident, i was chatting to a friend, and the topic went to internet tech in 1990s, and i was telling him about how PHP (originally Personal Home Page) came about, then naturally i discussed CGI. After a while, i realized, those who are around 20 years old today were under 10 in the 1990s. They wouldn't know what was CGI, and no amount of explanation can tell them exactly what it was like, because it has become HISTORY — if you didn't live it, you can't feel it. http://xahlee.blogspot.com/2011/01/do-you-know-what-is-cgi.html Xah ∑ http://xahlee.org/ -- http://mail.python.org/mailman/listinfo/python-list
Re: opinion: comp lang docs style
On Jan 4, 3:17 pm, ru...@yahoo.com ru...@yahoo.com wrote: On 01/04/2011 01:34 PM, Terry Reedy wrote: On 1/4/2011 1:24 PM, an Arrogant Ignoramus wrote: what he called a opinion piece. I normally do not respond to trolls, but while expressing his opinions, AI made statements that are factually wrong at least as regards Python and its practitioners. Given that most trolls include factually false statements, the above is inconsistent. And speaking of arrogant, it is just that to go around screaming troll about a posting relevant to the newsgroup it was posted in because you don't happen to agree with its content. In doing so you lower your own credibility. (Which is also not helped by your Arrogant Ignoramus name-calling.) yeah. i called them idiots, he calls me Artificial Intelligence ☺. fair game. No. The language reference (LR) and standard library reference (SLR) must stand on their own merits. It is nice to have a good tutorial for those who like that style of learning. But it should be possible for a programmer with a basic understanding of computers and some other programming languages to understand how to program in python without referring to tutorials, explanatory websites, commercially published books, the source code, etc. yes exactly. the best python reference to me is Richard Gruet's quick ref: http://rgruet.free.fr/PQR26/PQR2.6.html on the python doc, afaik people complains all the time, and i know at least 3 times in different years people have tried to bring up projects to fix it, all shot down with spit badly by python priests, of course. just 2 days ago, i was pissed due to python doc url disappearance too http://xahlee.org/perl-python/python_doc_url_disappearance.html Xah -- http://mail.python.org/mailman/listinfo/python-list
opinion: comp lang docs style
an opinion piece. 〈The Idiocy of Computer Language Docs〉 http://xahlee.org/comp/idiocy_of_comp_lang.html -- The Idiocy of Computer Language Docs Xah Lee, 2011-01-03 Worked with Mathematica for a whole day yesterday, after about 10 years hiatus. Very nice. Mathematica lang and doc, is quite unique. Most other langs drivel with jargons, pettiness, comp-sci pretentiousness, while their content is mathematically garbage. (unixism mumbo jumbo (perl, unix), or “proper”-engineering OOP fantasy (java), or impractical and ivory-tower academician idiocy as in Scheme Haskell ( currying, tail recursion, closure, call-cc, lisp1 lisp2, and monad monad monad!)) (See: What are OOP's Jargons and Complexities ◇ Language, Purity, Cult, and Deception.) Mathematica, in its doc, is plain and simple. None of the jargon and pretension shit. Very easy to understand. Yet, some of its functions' technical aspects are far more scholarly abstruse than any other lang (dealing with advanced math special functions that typically only a few thousand people in the world understand.). -- A Gander into the Idiocies Here's a gander into the doc drivel in common langs. -- unix In unix man pages, it starts with this type of garbage: SYNOPSIS gzip [ -acdfhlLnNrtvV19 ] [-S suffix] [ name ... ] gunzip [ -acfhlLnNrtvV ] [-S suffix] [ name ... ] zcat [ -fhLV ] [ name ... ] SYNOPSIS zip [-aabcddeeffghjkllmoqrrstuvvwx...@$] [--longoption ...] [-b path] [-n suffixes] [-t date] [-tt date] [zipfile [file ...]] [-xi list] Here, the mindset of unix idiots, is that somehow this “synopsis” form is technically precise and superior. They are thinking that it captures the full range of syntax in the most concise way. In practice, it's seldom read. It's actually not as accurate as one'd think; no program can parse it and agree with the actual behavior. It's filled with errors, incomprehensible to humans.
Worst of all, the semantics of unix software's options are the worst rape to any possible science in computer science. See: The Nature of the Unix Philosophy ◇ Unix Pipe As Functional Language ◇ Unix zip Utility Path Problem. -- Python In Python, you see this kinda garbage: 7.1. The if statement The if statement is used for conditional execution: if_stmt ::= "if" expression ":" suite ( "elif" expression ":" suite )* ["else" ":" suite] (Source docs.python.org) Here, the mindset of the python idiots is similar to the unix tech geekers. They think that using the BNF notation makes their doc more clear and precise. The fact is, there are so many variations of BNF each trying to fix the others' problems. BNF is actually not used as a computer language for syntax description. It's mostly used to communicate syntax to humans. Like regex, there are so many variations. But worse than regex in the sense that there are actually not many actual implementations of BNF. Real world syntax description languages are usually nothing close to BNF. See: Pattern Matching vs Lexical Grammar Specification. This incomprehensible BNF notation is the only thing you get if you want to know the basic syntax of “if”, “for”, “while”, “lambda”, or other basic constructs of python. -- Perl In perl, you see this type of drivel: A Perl program consists of a sequence of declarations and statements which run from the top to the bottom. Loops, subroutines and other control structures allow you to jump around within the code. Perl is a free-form language, you can format and indent it however you like. Whitespace mostly serves to separate tokens, unlike languages like Python where it is an important part of the syntax. Many of Perl's syntactic elements are optional. Rather than requiring you to put parentheses around every function call and declare every variable, you can often leave such explicit elements off and Perl will figure out what you meant. This is known as Do What I Mean, abbreviated DWIM.
It allows programmers to be lazy and to code in a style with which they are comfortable. Perl borrows syntax and concepts from many languages: awk, sed, C, Bourne Shell, Smalltalk, Lisp and even English. Other languages have borrowed syntax from Perl, particularly its regular expression extensions. So if you have programmed in another language you will see familiar pieces in Perl. They often work the same, but see perltrap for information about how they differ. (Source perldoc.perl.org) Notice they introduced you to their lingo “DWIM”. Juvenile humor is a characteristic of perl's docs. It's a whole cult. They have “perl republic”, “state of the onion”, “apocalypse”, “perl monger”, “perl golf”, etc. (See: Larry Wall and Cults.) Another trait is irrelevant rambling. For example, in the above you see
Re: Google AI challenge: planet war. Lisp won.
On Dec 20, 10:06 pm, Jon Harrop use...@ffconsultancy.com wrote: Wasn't that the challenge where they wouldn't even accept solutions written in many other languages (including both OCaml and F#)? Ocaml is one of the supported lang. See: http://ai-contest.com/starter_packages.php there are 12 teams using OCaml. See: http://ai-contest.com/rankings.php (click on the lang to see all teams using that lang) Xah -- http://mail.python.org/mailman/listinfo/python-list
Google AI challenge: planet war. Lisp won.
discovered this rather late. Google has an AI Challenge: Planet Wars. http://ai-contest.com/index.php it started some 2 months ago and ended at the start of this month. the winner is Gábor Melis, with his code written in lisp. Congrats lispers! Gábor wrote a blog about it here http://quotenil.com/Planet-Wars-Post-Mortem.html (not sure if this has been mentioned here but a quick search didn't find it) Xah ∑ http://xahlee.org/ ☄ -- http://mail.python.org/mailman/listinfo/python-list
Re: Land Of Lisp is out
On Oct 28, 12:59 am, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: In message 3fe80ac4-b595-4bcb-96b9-9138b1ec5...@l17g2000yqe.googlegroups.com, TheFlyingDutchman wrote: On Oct 27, 4:55 pm, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: Would it be right to say that the only Lisp still in common use is the Elisp built into Emacs? There is a new version of Lisp called Clojure that runs on the Java Virtual Machine (JVM) that is on the upswing. Now is not exactly a good time to build new systems crucially dependent on the continuing good health of Java though, is it? java's been receiving shit in recent years... Sun went belly up, Apple bans it, Oracle sues... but one thing to note is that it is currently the most popular lang, or in the top 3, along with C and C++. And java is certainly better than these 2. Sad to know, but java, along with its jvm, is likely to be with us for a while. btw, interesting to know that the landoflisp site mentioned Common Lisp, Scheme lisp, clojure, arc lisp, and emacs lisp in the vid too, but didn't mention newLisp or Qi lisp. here's my fav part of the comics http://xahlee.org/comp/land_of_lisp.html Conrad is certainly a fervent lisp lover. ( as is Peter Seibel http://www.gigamonkeys.com/book/ ) Conrad is also a comics artist. The landoflisp was going all over twitter yesterday and apparently many already ordered it. Hope he does very well. Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: Land Of Lisp is out
On Oct 28, 1:42 am, p...@informatimago.com (Pascal J. Bourguignon) wrote: sthueb...@googlemail.com (Stefan Hübner) writes: Would it be right to say that the only Lisp still in common use is the Elisp built into Emacs? Clojure (http://clojure.org) is a Lisp on the JVM. It's gaining more and more traction. There are actually 2 REAL Lisp on the JVM: - abcl http://common-lisp.net/project/armedbear/ and - CLforJava http://www.clforjava.org lol. He said REAL! how about the 10 Scheme Lisps on JVM? guess they are UNREAL. lol btw, who cross posted this thread to python? i call troll! Xah -- http://mail.python.org/mailman/listinfo/python-list
Re: is list comprehension necessary?
On Oct 27, 5:46 pm, rantingrick rantingr...@gmail.com wrote: On Oct 26, 4:31 am, Xah Lee xah...@gmail.com wrote: recently wrote a article based on a debate here. (can't find the original thread on Google at the moment) Hey all you numbskulls who are contributing the annoying off-topic chatter about Report Lab need to... 1) GET A LIFE 2) START A NEW THREAD! :) when i saw the thread got quietly hijacked to about PDF... it was funny. But yeah, they need to get a life. Xah -- http://mail.python.org/mailman/listinfo/python-list
is list comprehension necessary?
recently wrote an article based on a debate here. (can't find the original thread on Google at the moment) • 〈What's List Comprehension and Why is it Harmful?〉 http://xahlee.org/comp/list_comprehension.html it hit reddit. http://www.reddit.com/r/programming/comments/dw8op/whats_list_comprehension_and_why_is_it_harmful/ though, i don't find the argument there informative. For python, i can understand that it might be preferred, due to the special syntax, being more in sync with python because of the imperative hints in keywords. (e.g. those “for”, “if” in it.) But for more pure functional lang (e.g. haskell), i think lc is pretty bad. here's the plain text version of my essay What's List Comprehension and Why is it Harmful? Xah Lee, 2010-10-14 This page explains what is List Comprehension, with examples from several languages, with my opinion on why the jargon and concept of “list comprehension” are unnecessary, and harmful to functional programing. What is List Comprehension? Here's an example of List Comprehension (LC) in python: S = [2*n for n in range(0,9) if ( (n % 2) == 0)] print S # prints [0, 4, 8, 12, 16] It generates a list from 0 to 8 by 「range(0,9)」, then removes the odd numbers by 「( (n % 2) == 0)」, then multiplies each element by 2 in 「2*n」, then returns a list. Python's LC syntax has this form: [myExpression for myVar in myList if myPredicateExpression] In summary, it is a special syntax for generating a list, and allows programers to also filter and apply a function to the list, but all done using expressions. In functional notation, list comprehension is doing this: map( f, filter(list, predicate)) Other languages' LC are similar. Here are some examples from Wikipedia. In the following, the filter used is 「x^2 > 3」, and the 「2*x」 is applied to the result. Haskell s = [ 2*x | x <- [0..], x^2 > 3 ] F# seq { for x in 0..100 do if x*x > 3 then yield 2*x } ;; OCaml [? 2 * x | x <- 0 -- max_int ; x * x > 3 ?];;
Clojure (take 20 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x))) Common Lisp (loop for x from 1 to 20 when (> (* x x) 3) collect (* 2 x)) Erlang S = [2*X || X <- lists:seq(0,100), X*X > 3]. Scala val s = for (x <- Stream.from(0); if x*x > 3) yield 2*x Here's how Wikipedia explains List comprehension. Quote: A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists. The following features make up LC: * (1) A flat list generator, with the ability to do filtering and applying a function. * (2) A special syntax in the language. * (3) The syntax uses expressions, not functions. Why is List Comprehension Harmful? • List Comprehension is an opaque jargon; it hampers communication, and encourages misunderstanding. • List Comprehension is a redundant concept in programing. It is a very simple list generator. It can be easily expressed in existing functional form 「map(func, filter(list, predicate))」 or imperative form e.g. perl: 「for (0..9) { if ( ($_ % 2) == 0) {push @result, $_*2 }}」. • The special syntax of List Comprehension, as it exists in many langs, is not necessary. If a special purpose function is preferred, then it can simply be a plain function, e.g. 「LC(function, list, predicate)」. Map + Filter = List Comprehension Semantics The LC's semantics is not necessary.
A better way and more in sync with functional lang spirit, is simply to combine plain functions: map( f, filter(list, predicate)) Here's the python syntax: map(lambda x: 2*x , filter( lambda x:x%2==0, range(9) ) ) # result is [0, 4, 8, 12, 16] In Mathematica, this can be written as: Map[ #*2 &, Select[Range@9, EvenQ]] In Mathematica, arithmetic operations can be applied to a list directly without using Map explicitly, so the above can be written as: Select[Range@9, EvenQ] * 2 in my coding style, i usually write it in the following syntactically equivalent forms: (#*2 &) @ (Select[#, EvenQ]&) @ Range @ 9 or 9 // Range // (Select[#, EvenQ]&) // (#*2 &) In the above, we sequence functions together, as in unix pipe. We start with 9, then apply “Range” to it to get a list from 1 to 9, then apply a function that filters out odd numbers, then we apply a function to multiply each number by 2. The “//” sign is a postfix notation, analogous to bash's “|”, and “@” is a prefix notation that's the reverse of “|”. (See: Short Intro of Mathematica For Lisp Programers.) List Comprehension Function Without Special Syntax Suppose we want some “list comprehension” feature in a functional lang. Normally, by default this can be done by map(func, filter(inputList, Predicate)) but perhaps this usage is so frequent that we want to create a new function for it, to make it more convenient, and perhaps easier to make the compiler to optimize more. e.g. LC(func, inputList, Predicate) this is about whether a lang should create a new convenient function that otherwise require 3 function
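A minimal sketch, in Python 3, of the plain-function form the essay above argues for. The name `lc` is hypothetical, standing in for the essay's 「LC(func, inputList, Predicate)」, and it is built only from the built-in map and filter. Note that Python's own filter takes the predicate first, the reverse of the essay's 「filter(list, predicate)」 notation.

```python
def lc(func, items, predicate):
    """Hypothetical plain-function 'list comprehension': filter first, then map."""
    # Built-in filter takes (predicate, iterable); map takes (function, iterable).
    # list() is needed in Python 3 because map returns a lazy iterator.
    return list(map(func, filter(predicate, items)))


def double(n):
    return 2 * n


def is_even(n):
    return n % 2 == 0


# The essay's example written both ways: special syntax vs. plain function.
by_comprehension = [2 * n for n in range(0, 9) if n % 2 == 0]
by_function = lc(double, range(0, 9), is_even)

print(by_comprehension)  # [0, 4, 8, 12, 16]
print(by_function)       # [0, 4, 8, 12, 16]
```

Both forms give the same list, so the choice between them is about notation, which is the point the essay is debating.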
how to name a function in a comp lang (design)
A great piece about terminology in computer languages. * 〈The Poetry of Function Naming〉 (2010-10-18) By Stephen Wolfram. At: http://blog.stephenwolfram.com/2010/10/the-poetry-of-function-naming/ See also: • 〈The Importance of Terminology's Quality In Computer Languages〉 http://xahlee.org/UnixResource_dir/writ/naming_functions.html where i gave some examples of the naming. Xah ∑ http://xahlee.org/ ☄ -- http://mail.python.org/mailman/listinfo/python-list
Re: how to name a function in a comp lang (design)
On Oct 20, 4:52 am, Marc Mientki mien...@nonet.com wrote: Am 20.10.2010 13:14, schrieb Xah Lee: See also: • 〈The Importance of Terminology's Quality In Computer Languages〉 http://xahlee.org/UnixResource_dir/writ/naming_functions.html where i gave some examples of the naming. I'd like to introduce a blog post by Stephen Wolfram, on the design process of Mathematica. In particular, he touches on the importance of naming of functions. The functions in Mathematica, are usually very well-named, in contrast to most other computing languages. Let me give a few examples. [...] thanks for your post. didn't know you also use Mathematica. on the aspect of function naming, i think Mathematica is rather unique in its philosophy. Am not aware of any other lang old or new that follows a similar philosophy... possibly except javascript. It is much easier to improve something good than to invent from scratch. When Lisp was born, Stephen Wolfram was still wearing diapers. For your information: Mathematica was my first Lisp-like language. I used it about 10 years almost every day and I love it because of the beauty of the concept. But Mathematica has two serious problems: first, there is only one implementation and it is commercial, and secondly, Mathematica is very, very slow and does not generate executable code that can be used without Mathematica itself. Thus, comparisons to other languages, such as Lisp are not fair. you are right... though these aspects don't have much to do with function naming. i tend to think that Mathematica is that way due to a unique mind, Stephen Wolfram. And if i may say, i share much mindset with him with respect to many lang design issues. (or rather, Mathematica was my first lang for about 6 years too) But i think rather, Mathematica's lang design philosophy has more to do with a certain pure mathematician mindset. This is somewhat similar to how haskell is a lang designed such that it is much independent of any concept of hardware.
Same here with Mathematica, but on the naming aspect, Mathematica's function names are designed without even much relation to comp sci lingo, but rather, the essence of ideas captured in a mathematical way. Xah ∑ http://xahlee.org/ ☄ -- http://mail.python.org/mailman/listinfo/python-list
Re: toy list processing problem: collect similar terms
On Sep 25, 9:05 pm, Xah Lee xah...@gmail.com wrote: here's an interesting toy list processing problem. I have a list of lists, where each sublist is labelled by a number. I need to collect together the contents of all sublists sharing the same label. So if I have the list ((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t)) where the first element of each sublist is the label, I need to produce: output: ((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t)) ... thanks all for many interesting solutions. I've been so busy in past month on other computing issues and writing and never got around to look at this thread. I think eventually i will, but for now just made a link on my page to point to here. now we have solutions in perl, python, ruby, common lisp, scheme lisp, mathematica. I myself would also be interested in javascript, perhaps i'll write one soon. If someone would go thru all these solutions and make a good summary with consistent format/names of each solution... that'd be very useful i think. (and will learn a lot, which is how i find this interesting) PS here's a good site that does very useful comparisons for those learning multiple langs. * 〈Lisp: Common Lisp, Scheme, Clojure, Emacs Lisp〉 http://hyperpolyglot.wikidot.com/lisp * 〈Scripting Languages: PHP, Perl, Python, Ruby, Smalltalk〉 http://hyperpolyglot.wikidot.com/scripting * 〈Scripting Languages: Bash, Tcl, Lua, JavaScript, Io〉 http://hyperpolyglot.wikidot.com/small * 〈Platform Languages: C, C++, Objective C, Java, C#〉 http://hyperpolyglot.wikidot.com/c * 〈ML: Standard ML, OCaml, F#, Scala, Haskell〉 http://hyperpolyglot.wikidot.com/ml Xah ∑ http://xahlee.org/ ☄ -- http://mail.python.org/mailman/listinfo/python-list
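For reference, the toy problem quoted above can be sketched in Python 3 as follows. This is only one possible reading, not any of the solutions posted in the thread: the function name `collect_by_label` is made up for illustration, and it assumes the groups should come out in order of each label's first appearance (which, for the sample input, is the same as sorted label order).

```python
def collect_by_label(rows):
    """Merge the contents of all sublists that share the same leading label."""
    groups = {}  # dicts keep insertion order in Python 3.7+
    for label, *items in rows:
        # Start an empty group the first time a label is seen, then append.
        groups.setdefault(label, []).extend(items)
    return list(groups.values())


# The sample input from the post, with symbols written as strings.
sample = [
    [0, "a", "b"], [1, "c", "d"], [2, "e", "f"], [3, "g", "h"],
    [1, "i", "j"], [2, "k", "l"], [4, "m", "n"], [2, "o", "p"],
    [4, "q", "r"], [5, "s", "t"],
]

result = collect_by_label(sample)
for group in result:
    print(group)
```

If sorted label order is wanted regardless of input order, replacing the return line with a loop over `sorted(groups)` would be one way to get it.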
Re: Unicode Support in Ruby, Perl, Python, Emacs Lisp
2010-10-09 On Oct 9, 3:45 pm, Sean McAfee eef...@gmail.com wrote: Xah Lee xah...@gmail.com writes: Perl's exceedingly lousy unicode support hack is well known. In fact it is the primary reason i “switched” to python for my scripting needs in 2005. (See: Unicode in Perl and Python) I think your assessment is antiquated. I've been doing Unicode programming with Perl for about three years, and it's generally quite wonderfully transparent. you are probably right. The last period i did serious perl is 1998 to 2004. Since, have pretty much lost contact with perl community. i have like 5 years of 8-hours-a-day experience with perl... the app we wrote is probably the largest perl web app at the time, say within the top 10 largest perl web apps, during the dot com days. spent 2 years with python about 2005, 2006, but mostly just personal dabbling. my dilemma is this... i am really tired of perl, so i thought python is my solution. Comparing the syntax, semantics, etc, i really do find python better, but to know python as well as i know perl, or, to know a lang really as an expert (e.g. intimately familiar with all the ins and outs of constructs, idioms, their speeds, libraries out there, their nature, which are used, their bugs etc), takes years. So, whenever i have this psychological urge to totally ditch perl and hug python 100% ... but it takes a huge amount of time to dig into a lang well again, so sometimes i thought of sticking with my perl due to my existing knowledge and forthwith stop wasting valuable time, but then, whenever i work in perl with its hack nature and crooked community (all those mongers fuck), especially the syntax for nested list/hash that's more than 3 levels (and my code almost always relies on nested list/hash to do things since am a functional programer), and compare to python's syntax on nested structure, i ask myself again, is this shit really what i want to keep on at?
and python 3 comes in, and over the years i learned, that Guido really hates functional programing (he understands it nil), and python is moving more into oop mumbo jumbo with more special syntaxes and special semantics. (and perl is trivially far more capable at functional programing than python) So, this puts a damnation in my mental struggle for python. in the end i really haven't decided on anything, as usual... it's not really a concrete, answerable question anyway, it's just psy struggle on some fuzzy ideal about efficiency and perfect lang. and there's ruby... (among others) and because i'm such a douchebag for langs, now and then i suppose i waste my time to venture and read about ruby, the unconscious excuse is that maybe ruby will turn out to simply solve all my life's problems, but nagging in the back of my mind is the reality that, yeah, go spend 3 years 8 hours a day on ruby, then possibly it'll be practically useful to me as i do with perl already, and, no, it won't bring you anything extra as far as lang goes, for that you go to OCaml/F#, erlang, Mathematica ... and who knows what kinda hidden needle in the eye i'll discover on my road in ruby. btw, this is all just a geek's mental disorder, common with many who are into lang design and beauty etc type of shit. (high percentage of this crowd hang in newsgroups) But the reality is that, this psychological problem really doesn't have much practical justification ... it's just fret, fret, fret. Fret, fret, fret. Years of fretting, while others have written great apps all over the web. in practice, i do not even have a need for perl or python in my work since about 2006, except a few find/replace scripts for text processing that i've written in the past. And, since about 2007, i've been increasingly writing lots and lots more in elisp. (and this emacs beast, is really a true love more than anything) So these days, almost all of my scripts are in elisp.
(and my job these days is mainly just text processing programing) • 〈Xah on Programing Languages〉 http://xahlee.org/Periodic_dosage_dir/comp_lang.html On the programmers' web site stackoverflow.com, I flag questions with the unicode tag, and of questions that mention a specific language, Python and C++ seem to come up the most often. I'll have to say, as far as text processing goes, the most beautiful lang with respect to unicode is emacs lisp. In elisp code (e.g. Generate a Web Links Report with Emacs Lisp ), i don't have to declare none of the unicode or encoding stuff. I simply write code to process string or buffer text, without even having to know what encoding it is. Emacs the environment takes care of all that. It's not quite perfect, though. I recently discovered that if I enter a Chinese character using my Mac's Chinese input method, and then enter the same character using a Japanese input method, Emacs regards them as different characters, even though they have the same Unicode code point. For example, from describe-char: character: 一 (43323, #o124473, #xa93b
Unicode Support in Ruby, Perl, Python, Emacs Lisp
here are my experiences dealing with unicode in various langs. Unicode Support in Ruby, Perl, Python, Emacs Lisp Xah Lee, 2010-10-07 I looked at Ruby 2 years ago. One problem i found is that it does not support Unicode well. I just checked today, it still doesn't. Just do a web search on blogs and forums on “ruby unicode”. Perl's exceedingly lousy unicode support hack is well known. In fact it is the primary reason i “switched” to python for my scripting needs in 2005. (See: Unicode in Perl and Python) Python 2.x's unicode support is also not ideal. You have to declare your source code with header like 「#-*- coding: utf-8 -*-」, and you have to declare your string as unicode with “u”, e.g. 「u"林花謝了春紅"」. In regex, you have to use unicode flag such as 「re.search(r'\.html$',child,re.U)」. And when processing files, you have to read in with 「unicode(inF.read(),'utf-8')」, and printing out unicode you have to do 「outF.write(outtext.encode('utf-8'))」. If you are processing lots of files, and if one of the files contains a bad char or doesn't use the encoding you expected, your python script chokes dead in the middle, you don't even know which file it is or which line unless your code prints file names. Also, if the output shell doesn't support unicode or doesn't match with the encoding specified in your python print, you get gibberish. It is often a headache to figure out the locale settings, what encoding the terminal supports or is configured to handle, the encoding of your file, and which encoding the “print” is using. It gets more complex if you are going thru a network, such as ssh. (most shells, terminals, as of 2010-10, in practice, still have problems dealing with unicode. (e.g. Windows Console, PuTTY. Exception being Mac's Apple Terminal.)) Python 3 supposedly fixed the unicode problem, but i haven't used it. Last time i looked into whether i should adopt python 3, it apparently wasn't used much.
(See: Python 3 Adoption) (and i'm quite pissed that Python is going more and more into OOP mumbo jumbo with lots of ad hoc syntax (e.g. “views”, “iterators”, “list comprehension”.)) I'll have to say, as far as text processing goes, the most beautiful lang with respect to unicode is emacs lisp. In elisp code (e.g. Generate a Web Links Report with Emacs Lisp ), i don't have to declare any of the unicode or encoding stuff. I simply write code to process string or buffer text, without even having to know what encoding it is. Emacs the environment takes care of all that. It seems that javascript and PHP also support unicode well, but i don't have extensive experience with them. I suppose that elisp, php, javascript, all support unicode well because these langs have to deal with unicode in practical day-to-day situations. -- for links, see http://xahlee.blogspot.com/2010/10/unicode-support-in-ruby-perl-python.html Xah ∑ xahlee.org ☄ -- http://mail.python.org/mailman/listinfo/python-list
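The "chokes dead in the middle, you don't even know which file" complaint in the essay above can be worked around by catching the decode error per file. What follows is a minimal sketch in Python 3 (which the essay says its author had not tried), not the author's own code: `read_text_files` is a hypothetical helper, and UTF-8 as the expected encoding is an assumption.

```python
import tempfile
from pathlib import Path


def read_text_files(paths, encoding="utf-8"):
    """Decode each file; collect failures instead of dying on the first bad byte."""
    texts, failures = {}, []
    for path in paths:
        try:
            texts[str(path)] = Path(path).read_text(encoding=encoding)
        except UnicodeDecodeError as err:
            # err.start is the byte offset of the first undecodable byte.
            failures.append((str(path), err.start))
    return texts, failures


# Demo with one valid UTF-8 file and one file containing an invalid byte.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.txt"
    bad = Path(tmp) / "bad.txt"
    good.write_text("林花謝了春紅", encoding="utf-8")
    bad.write_bytes(b"abc\xff\xfe")  # 0xff can never start a UTF-8 sequence
    texts, failures = read_text_files([good, bad])

for name, offset in failures:
    print("could not decode", Path(name).name, "at byte", offset)
```

The good file still gets processed, and the report names the bad file and the byte offset, so a batch job over many files does not have to be rerun blind.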
Re: (and scheme lisp) x Python and modern langs [was Re: gossip, Guy Steel, Lojban, Racket]
On Sep 29, 11:02 am, namekuseijin namekusei...@gmail.com wrote: On 28 set, 19:38, Xah Lee xah...@gmail.com wrote: • “list comprehension” is a very bad jargon; thus harmful to functional programing or programing in general. Being a bad jargon, it encourages mis-communication, mis-understanding. I disagree: it is a quite intuitive term to describe what the expression does. what's your basis in saying that “list comprehension” is intuitive? any statistics, survey, research, references you have to cite? to put this in context, are you saying that lambda, is also intuitive? “let” is intuitive? “for” is intuitive? “when” is intuitive? I mean, give your evaluation of some common computer language terminologies, and tell us which you think are good and which are bad, so we have some context to judge your claim. For example, let us know, in your view, how good are terms: currying, lisp1 lisp2, tail recursion, closure, subroutine, command, object. Or, perhaps expound on the comparative merits and meaning of the terms module vs package vs add-on vs library. I would like to see your view on this with at least a few paragraphs of analysis on each. If you, say, write an essay that's at least 1k words on this topic, then we all can make some judgement of your familiarity and understanding in this area. Also, “being intuitive” is not the only aspect to consider whether a term is good or bad. For example, emacs uses the term “frame”. It's quite intuitive, because frame is a common english word, everyone understands. You know, door frame, window frame, picture frame, are all analogous to emacs's “frame” on a computer. However, by some turn of history, in computer software we call such a thing “window” now, and by happenstance the term “window” also has a technical meaning in emacs, what we call “split window” or “pane” today.
So, in emacs, the terms “frame” and “window” are confusing, because emacs's “frame” is what we call “window”, while emacs's “window” is what we might call a pane of a split window. So here, is an example, that even when a term is intuitive, it can still be bad. as another example, common understanding by the target group the term is to be used by is also an important aspect. For example, the term “lambda”, which is a name of greek char, does not convey well what we use it for. The word's meaning by itself has no connection to the concept of function. The char happens to be used by a logician as a shorthand notation in his study of what's called “lambda calculus” (the “calculus” part is basically 1700's terminology for a systematic science, especially related to mechanical reasoning). However, the term “lambda” used in this way in computer science and programing has been long and wide, around 50 years in recent history (and more back if we trace origins). So, because of established use, here it may decrease the level of what we might think of it as a bad jargon, by the fact that it already become a standard usage or understanding. Even still, note that just because a term has established use, if the term itself is very bad in many other aspects, it may still warrant a need for change. For one example of a reason, the jargon will be a learning curve problem for all new generations. You see, when you judge a terminology, you have to consider many aspects. It is quite involved. When judging a jargon, some questions you might ask are: • does the jargon convey its meaning by the word itself? (i.e. whether the jargon as a word is effective in communication) • how long has the jargon been in use? • do people in the community understand the jargon? (e.g. what percentage) each of these sample questions can get quite involved.
For example, it calls for expertise in linguistics (many sub-fields are relevant: pragmatics, history of language, etymology), practical experience in the field (programing or computer science), educational expertise (e.g. educators, professors, programing book authors/teachers), scientific survey, social science of communication...

also, you may not know, there are bodies of professional scientists who work on terminologies for publication. It is not something like “I think it's good, because it is intuitive to me.”

I wrote about 14 essays on various jargons in the past decade. You can find them on my site.

i removed your arguments on other parts about “list comprehension”, because i didn't find them valuable (barely read them). However, i appreciate your input that the “do” in Scheme lisp has a functional usage, and some other misc chat info from the beginning of this thread on comp.lang.lisp.

Xah ∑ xahlee.org ☄ -- http://mail.python.org/mailman/listinfo/python-list
Re: toy list processing problem: collect similar terms
On Sep 27, 9:34 pm, John Bokma j...@castleamber.com wrote:

Seebs usenet-nos...@seebs.net writes: fup set to poster

On 2010-09-28, John Bokma j...@castleamber.com wrote:

Seebs usenet-nos...@seebs.net writes:

On 2010-09-26, Jürgen Exner jurge...@hotmail.com wrote:

It was livibetter who without any motivation or reasoning posted Python code in CLPM.

Not exactly; he posted it in a crossposted thread, which happened to include CLPM and other groups, including comp.lang.python. It is quite possible that he didn't know about the crossposting.

Oh, he does. It has been Xah's game for years.

But did livibetter know about it? I wasn't defending Xah, who is indeed at the very least clueless and disruptive.

Heh, he's not clueless, the problem is that he knows exactly what he's doing. And like most spammers, very hard to get rid of.

But I was sort of defending the poster who was accused of posting Python code in CLPM, because that poster may not have understood the game.

Ah, clear. Well, the problem is somewhat also in CLPM where people somehow have to reply to messages like this :-(. And I am sure Xah laughs his ass off each time it happens.

Hi John Bokma,

can you stop this? it doesn't seem fruitful to keep on with this.

if you don't like my posts, ignore them? i don't post in comp.lang.python or comp.lang.perl.misc that often... maybe have done so 5 times this year.

i visited your home page http://johnbokma.com/mexit/2010/08/15/ and there are really a lot of beautiful photos.

this isn't bribery or something like that. I've been annoyed by you, of course, but it's not fruitful to keep going on this.

Best,

Xah ∑ xahlee.org ☄
Re: (and scheme lisp) x Python and modern langs [was Re: gossip, Guy Steel, Lojban, Racket]
xah wrote:

in any case, how's “do” not imperative?

On Sep 28, 6:27 am, namekuseijin namekusei...@gmail.com wrote:

how's “do” a “named let”? can you show example or reference of that proposal? (is it worthwhile?)

I'll post it again in the hope you'll read this time:

    (do ((i 0 (+ 1 i))   ; i initially 0, (+ 1 i) at each step
         (r 0 (+ i r)))  ; r initially 0, (+ i r) at each step
        ((> i 5) r))     ; when i > 5, return r
    => 15

it's merely a macro (syntax) that gets transformed into this:

    (let loop ((i 0) (r 0))
      (if (> i 5)
          r
          (loop (+ 1 i) (+ i r))))
    => 15

which is merely a macro that essentially gets transformed into this:

    ((lambda (loop) (loop loop 0 0))
     (lambda (loop i r)
       (if (> i 5)
           r
           (loop loop (+ 1 i) (+ i r)))))
    => 15

which, as you can see, is merely function application. There's nothing there except evaluation of arguments, application of arguments to function and function return. It's not because they chose `do', or `for' or `while' for naming such syntax, that it behaves the same as their imperative homographs.

as i said, regarding do: “do” in general in any lang is simply imperative. We don't even have to know the details. I don't care whatnot fuck proposal from whatnot lisp of what's actually going on. If it is named “do”, it is imperative.

It's not: one can name factorial do. It's just a name. Who doesn't like do? It's short, to the point...

ultimately, all lang gets transformed at the compiler level to become machine instructions, which is imperative programing in the ultimate sense.

You say that “do” is merely macro and ultimately function application. But to what level should we go down this chain on how the language actually works when evaluating a function in source code?

any functional lang quickly becomes imperative when compiled to some intermediate code or interpreted. In a sense, it can't be any other way. Functions are abstract mathematical ideas, while a “do” loop or any instruction are actual steps of algorithms.
Xah ∑ xahlee.org ☄
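For readers following along on python-list, the three Scheme forms quoted in the message above (the `do` loop, the named let, and the self-applied lambda) can be mirrored roughly in Python. This is only a sketch: the function names `do_style`, `named_let_style` and `lambda_style` are made up for illustration, and Python, unlike Scheme, does not eliminate tail calls, so the recursive versions are only reasonable for small loop counts like this one.

```python
def do_style():
    # Loop reading of the Scheme `do`: step i and r until i > 5.
    i, r = 0, 0
    while not i > 5:
        # Both steppers see the old values of i and r, as in Scheme's `do`.
        i, r = i + 1, r + i
    return r


def named_let_style():
    # Named-let reading: the loop is a local recursive function.
    def loop(i, r):
        return r if i > 5 else loop(i + 1, r + i)
    return loop(0, 0)


def lambda_style():
    # Self-application reading: `loop` receives itself as its first argument.
    return (lambda loop: loop(loop, 0, 0))(
        lambda loop, i, r: r if i > 5 else loop(loop, i + 1, r + i)
    )


print(do_style(), named_let_style(), lambda_style())  # prints: 15 15 15
```

All three compute the same value; whether that makes the first form "functional" or "imperative" is exactly the point the two posters disagree on, and the sketch does not settle it.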
Re: (and scheme lisp) x Python and modern langs [was Re: gossip, Guy Steel, Lojban, Racket]
2010-09-28

On Sep 28, 12:07 pm, namekuseijin namekusei...@gmail.com wrote:

On 28 set, 14:56, Xah Lee xah...@gmail.com wrote:

ultimately, all lang gets transformed at the compiler level to become machine instructions, which is imperative programing in the ultimate sense. You say that “do” is merely macro and ultimately function application. But to what level should we go down this chain on how the language actually works when evaluating a function in source code? any functional lang quickly becomes imperative when compiled to some intermediate code or interpreted. In a sense, it can't be any other way. Functions are abstract mathematical ideas, while a “do” loop or any instruction are actual steps of algorithms.

That is true of Mathematica too. Difference being that do in scheme is pretty much user-level syntax. If you look in the sources of most (good) implementations, do and let are defined in scheme itself, not C or lower-level: C code only deals with transforming lambda application and tail calls into proper gotos. So, as far as we're talking about scheme, haskell or Mathematica code, it's all functional in its abstraction context.

do syntax does allow for imperative commands to be issued in scheme, just like let. It does not mean it was used in said examples nor that do is automatically inherently imperative just because of choice of name.

    imperative do -> (do (steppers ...) (final-condition? result) malign-imperative-code-here ...)

so, now that we got it clear why do in scheme is not (inherently) imperative, why were you bashing useful (at times) functional syntax in the form of list comprehensions again?

let's get precise. The argument i want to make is this:

• “list comprehension” is a very bad jargon; thus harmful to functional programing or programing in general. Being a bad jargon, it encourages mis-communication, mis-understanding.
• “list comprehension” is a redundant concept in programing. The concept is of historical interest, e.g.
when people talk about the history of computer languages. The LC can simply be filter(map(func, list), predicate).
• The special syntax of “list comprehension” as it exists in many langs is not necessary. It can simply be a plain function, e.g. LC(function, list, filter).

I gave a stand-alone explanation of these points at: http://groups.google.com/group/comp.lang.lisp/msg/329b3b68ff034453

Do you disagree or agree with the above? This is the point _I_ want to argue about, but you don't seem to admit any part of it; you seem to want to discuss “do” in Scheme lisp being functional.

So, perhaps we can now focus on this subject: the “do” in Scheme lisp is not imperative, or, it can be considered as functional.

Alright. To be honest, i haven't had enough experience to comment on it, but in general, i understand the example you've given, and i disagree. Full report on your argument on this is given at: http://groups.google.com/group/comp.lang.lisp/msg/87a987070e80231f

Now, in your message (quoted above), you made a further argument on this. I think the main point is this, quote:

«do syntax does allow for imperative commands to be issued in scheme, just like let. It does not mean it was used in said examples nor that do is automatically inherently imperative just because of choice of name. imperative do -> (do (steppers ...) (final-condition? result) malign-imperative-code-here ...)»

i don't think your argument is forceful enough. It appears that by this argument you even say that “let” is not functional.

Here, the issue verges on what is functional? What is a function? If a function in lisp is defined as a macro, does it cease to be considered a function? Likewise, if a lisp has a “for” loop that is defined as a macro, is that “for” now considered a function?

this is getting quite iffy. What level or aspect are we considering? In each lang, usually they define certain terms specifically to the lang, and to various degrees of precision.
For example, the term “object” means very different things in a technical way in different langs. Same for the words “function”, “keyword”, “command”, “module”, “package” ...

So, overall, i consider your argument for “do” in Scheme lisp being functional weak, or trivial. It seems to be a pet peeve. You got annoyed because i seem to have ignored it entirely. But i got annoyed by you because you don't get the point about what i consider the more significant opinion on “list comprehension”, which you totally ignored and kept hacking at the “do” in Scheme lisp.

Xah ∑ xahlee.org ☄
Re: (and scheme lisp) x Python and modern langs [was Re: gossip, Guy Steel, Lojban, Racket]
2010-09-27

For instance, this is far more convenient:

    [x+1 for x in [1,2,3,4,5] if x%2==0]

than this:

    map(lambda x:x+1,filter(lambda x:x%2==0,[1,2,3,4,5]))

How about this:

    LC(func, inputList, P)

compared to

    [func for myVar in inputList if P]

the functional form is:

• shorter
• not another idiosyncratic new syntax

now, a separate issue. Suppose we want some “list comprehension” feature in a functional lang. Normally, by default this can be done by

    filter( map(func, inputList), Predicate)

but perhaps this usage is so frequent that we want to create a new function for it, to make it more convenient, and perhaps easier for the compiler to optimize. e.g.

    LC(func, inputList, Predicate)

this is about whether a lang should create a new convenient function for what otherwise requires a combination of 2 functions. Common Lisp vs Scheme Lisp are the typical example of extreme opposites.

note, there's no new syntax involved.

Now, let's consider another separate issue related to so-called “list comprehension”. Suppose we decided that generating a list by a filter is so frequently used that it is worth creating a new func for it.

    LC(func, inputList, Predicate)

Now, in functional langs, in general a design principle is that you want to reduce the number of functions unless you really need them. Because any combination of list related functions could potentially be a new function in your lang. So, if we really think LC is useful, we might want to generalize it. e.g. in

    LC(func, inputList, Predicate)

is it worthwhile, say, to add a 4th param that says return just the first n? (here we presume the lang doesn't support lists of infinite elements) e.g.

    LC(func, inputList, Predicate, n)

what about partitioning the list into m sublists?

    LC(func, inputList, Predicate, n, m)

what about an actually more generalized partition, by m sublists then by m1 sublists then by m2 sublists?

    LC(func, inputList, Predicate, n, list(m,m1,m2,...))

what about sorting? maybe that's always used together when you need a list?
    LC(func, inputList, Predicate, n, list(m,m1,m2,...), sortPredicate)

what if actually frequently we want LC to map parallel to branches? e.g.

    LC(func, inputList, Predicate, n, list(m,m1,m2,...), sortPredicate, mapBranch:True)

what if ...

you see, each of these or a combination of these can be done by default in the lang by sequencing one or more functions (i.e. composition). But when we create a new function, we really should think a lot about its justification, because otherwise the lang becomes a bag of functions that are non-essential and confusing.

In summary:

• “list comprehension” is a bad jargon.
• The concept of “list comprehension” is redundant. There's no justification for the concept to exist except historical.
• The syntax of “list comprehension” in most langs is ad hoc syntax.

for those who find imperative langs good, then perhaps “list comprehension” is good, because it adds another idiosyncratic syntax to the lang, but such is the tradition of imperative langs. The ad hoc syntax aids in reading code by various syntactical forms and hint words such as “[... for ... in ...]”.

Xah ∑ xahlee.org ☄
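To make the comparison in this message concrete, here is a small Python sketch of the three-argument `LC(func, inputList, Predicate)` form next to the comprehension it is meant to replace. `LC` is not an existing Python function; the definition below is a hypothetical one written for this illustration. One assumption is worth stating loudly: the post writes the composition as filter(map(func, inputList), Predicate), which tests the predicate on the mapped values, while the Python comprehension tests the original elements. The sketch follows the comprehension's order and shows the other order separately.

```python
def LC(func, input_list, predicate):
    """Hypothetical LC: apply func to each element that passes predicate."""
    # Assumption: filter first, then map, to match the comprehension below.
    return list(map(func, filter(predicate, input_list)))


xs = [1, 2, 3, 4, 5]

as_comprehension = [x + 1 for x in xs if x % 2 == 0]
as_map_filter = list(map(lambda x: x + 1, filter(lambda x: x % 2 == 0, xs)))
as_function = LC(lambda x: x + 1, xs, lambda x: x % 2 == 0)

# Mapping first and filtering afterwards gives a different result.
mapped_first = list(filter(lambda x: x % 2 == 0, map(lambda x: x + 1, xs)))

print(as_comprehension, as_map_filter, as_function)  # [3, 5] [3, 5] [3, 5]
print(mapped_first)                                  # [2, 4, 6]
```

The `list(...)` wrappers are there because in Python 3 `map` and `filter` return iterators; in the Python 2 of the original thread they returned lists directly.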
Re: (and scheme lisp) x Python and modern langs [was Re: gossip, Guy Steel, Lojban, Racket]
On Sep 27, 12:11 pm, namekuseijin namekusei...@gmail.com wrote:

On 27 set, 16:06, Xah Lee xah...@gmail.com wrote:

2010-09-27 For instance, this is far more convenient:

    [x+1 for x in [1,2,3,4,5] if x%2==0]

than this:

    map(lambda x:x+1,filter(lambda x:x%2==0,[1,2,3,4,5]))

How about this: [snip]

how about this: read before replying.

hum??? i read your post quite carefully, and rather thought i replied well. In fact, i really wanted to tell you “read before replying” before, but refrained from making any such expression.

here's 2 previous posts about list compre.

http://groups.google.com/group/comp.lang.lisp/msg/145f6ecf29ebbdaf
http://groups.google.com/group/comp.lang.lisp/msg/62ca84062c9fcdca

Xah
Re: toy list processing problem: collect similar terms
2010-09-26

On Sep 25, 11:17 pm, Paul Rubin no.em...@nospam.invalid wrote:

Python solution follows (earlier one with an error cancelled). All crossposting removed since crossposting is a standard trolling tactic.

    from collections import defaultdict

    def collect(xss):
        # Group the tail of each sublist under its first element (the key).
        d = defaultdict(list)
        for xs in xss:
            d[xs[0]].extend(xs[1:])
        # Return the collected terms ordered by key.
        return list(v for k,v in sorted(d.items()))

    y = [[0,'a','b'], [1,'c','d'], [2,'e','f'], [3,'g','h'], [1,'i','j'],
         [2,'k','l'], [4,'m','n'], [2,'o','p'], [4,'q','r'], [5,'s','t']]

    print(collect(y))

Hi Paul,

thanks for your solution, and thanks to the many others who provided solutions. It'll take some time to go thru them.

btw, i disagree about your remark on crossposting. For those interested, you can read the following:

• 〈Cross-posting & Language Factions〉 http://xahlee.org/Netiquette_dir/cross-post.html

if anyone wants to argue with me about this, there's a link to my blog at the bottom of the page where you can leave a comment. Feel free to use that.

i'll go over the solutions and post if i have anything interesting to say. ☺ Possibly will select some to show on my site, with credit of course.

Xah ∑ xahlee.org ☄