2015-12-20

[SHARE] Top 10 Python libraries of 2015

Top 10 Python libraries of 2015
As the new year approaches, we often sit back and think about what we have accomplished in 2015. Many of our projects would not have been as successful if it…

via Instapaper http://ift.tt/1mlUtTO

2015-12-19

[SHARE] Analyzing 91 years of Time magazine covers for visual trends - PyImageSearch

Analyzing 91 years of Time magazine covers for visual trends - PyImageSearch
Today’s blog post will build on what we learned from last week: how to construct an image scraper using Python + Scrapy to scrape ~4,000 Time magazine cover…

via Instapaper http://ift.tt/1Lk0RD4

Reverse HTTP (PTTH)

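Reverse HTTP, abbreviated PTTH, is a protocol built on top of HTTP. Its purpose is to reverse the usual roles, in which the client sends a request and the server responds, so that the server sends the request and the client responds. With this reversal the client can receive events or notifications from the server, so it no longer has to poll the server for its state; when something changes on the server, the server notifies the client over the PTTH connection. For a usage example, see Reverse HTTP. The protocol is still at the draft stage (here). I came across it because Apple TV uses PTTH to notify handheld devices of changes in video playback state.

I found a Ruby implementation on GitHub (here) but no Python one, so I wrote my own PTTH client in Python (here). I originally wanted to build it on top of Requests, but I could not figure out how to take control of the socket inside Requests in order to receive requests from the server, so I implemented it directly with socket.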
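To make the role reversal concrete, here is a minimal sketch of the client side using plain sockets. It is not the code from my PTTH client: the function names are made up for illustration, and the `Upgrade: PTTH/1.0` header reflects my reading of the draft, so check it against the draft before relying on it.

```python
import socket


def build_upgrade_request(host, path="/"):
    """Build the HTTP request that asks the server to switch to PTTH."""
    lines = [
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Upgrade: PTTH/1.0",  # header value as I understand the draft
        "Connection: Upgrade",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def read_head(sock):
    """Read from the socket until the blank line that ends an HTTP head."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(1024)
        if not chunk:
            break
        data += chunk
    return data


def serve_one_reverse_request(sock, body=b"ok"):
    """After the upgrade, the client answers a request sent by the server."""
    head = read_head(sock)
    request_line = head.split(b"\r\n", 1)[0].decode("ascii")
    response = (
        b"HTTP/1.1 200 OK\r\nContent-Length: "
        + str(len(body)).encode("ascii")
        + b"\r\n\r\n"
        + body
    )
    sock.sendall(response)
    return request_line
```

The flow is: the client sends the upgrade request, the server answers with 101 Switching Protocols, and from then on the client sits in a loop calling something like serve_one_reverse_request() on the same socket.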

2015-12-13

[SHARE] Install OpenCV 3.0 and Python 2.7+ on OSX - PyImageSearch

Install OpenCV 3.0 and Python 2.7+ on OSX - PyImageSearch
As I mentioned last week, OpenCV 3.0 is finally here! And if you’ve been paying attention to my Twitter stream , you may have noticed a bunch of tweets…

via Instapaper http://ift.tt/1R0sAJo

2015-11-21

[SHARE] Stevey's Blog Rants: Get that job at Google

Stevey's Blog Rants: Get that job at Google
I've been meaning to write up some tips on interviewing at Google for a good long time now. I keep putting it off, though, because it's going to make you mad.…

via Instapaper http://ift.tt/LsykLP

2015-11-01

Building interactive maps with Python Folium

Folium


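Folium provides a Python API for Leaflet to build interactive maps. It combines Python's strength in data analysis with the ability to turn the analyzed data into an interactive map. Folium takes all the data points and generates a single HTML file that pulls in the Leaflet JavaScript it needs, so you do not have to think much about Leaflet itself and can concentrate on producing the data points. If you are used to working with data in Jupyter (IPython Notebook), see examples.ipynb, which uses an iframe to embed the interactive map generated by Folium in the Jupyter output.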

Samples


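I will use a simple project, tourist-map, to show how Folium is used. The project marks the locations of public drinking fountains, pedestrian litter bins, free Wi-Fi hotspots and public toilets in Taipei City so that tourists can look them up. The data comes from the Taipei City Government open data platform. The datasets are all in CSV or XML format, so pandas and xmltodict are enough to process them.

Public drinking fountains (data source)


Folium is simple to use. First specify the map's center point with location and the initial zoom level with zoom_start. Then mark each data point with simple_marker: location is the position of the point, and popup supplies the message shown when the marker is clicked. Setting clustered_marker=True groups markers that are too close together and expands them only when the map is zoomed in; turning this on makes the map look cleaner and shortens rendering time.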
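The steps above can be sketched roughly as follows. This is not the tourist-map code. The CSV column names (name, lat, lon) and the sample rows are made up, and the map part uses folium.Marker with the MarkerCluster plugin instead of simple_marker, because as far as I know newer Folium releases no longer provide simple_marker.

```python
import csv
import io


def read_points(csv_text, lat_col="lat", lon_col="lon", name_col="name"):
    """Parse CSV text into (latitude, longitude, popup text) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (float(row[lat_col]), float(row[lon_col]), row[name_col])
        for row in reader
    ]


def build_map(points, center, zoom_start=13, out_path="map.html"):
    """Write an HTML map with one clustered marker per point."""
    # Imported here so the CSV helper above works without Folium installed.
    import folium
    from folium.plugins import MarkerCluster

    fmap = folium.Map(location=center, zoom_start=zoom_start)
    cluster = MarkerCluster().add_to(fmap)  # groups nearby markers
    for lat, lon, name in points:
        folium.Marker(location=[lat, lon], popup=name).add_to(cluster)
    fmap.save(out_path)
    return out_path


# Hypothetical sample data standing in for the open data CSV.
SAMPLE = "name,lat,lon\nFountain A,25.0330,121.5654\nFountain B,25.0478,121.5319\n"
points = read_points(SAMPLE)
```

Calling build_map(points, center=[25.04, 121.55]) would then write map.html, which can be opened in a browser.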



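Pedestrian litter bins (data source)


The litter bin data is split into one file per administrative district, so it is enough to read the position of each data point from the 12 CSV files in turn and mark it on the map.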



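Free Wi-Fi hotspots (data source)


The free Wi-Fi hotspot data comes as an XML file, which xmltodict converts into a Python dict. Then pull the position of each data point out according to the XML structure and mark them one by one.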
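As a rough illustration of walking the XML structure, here is a sketch that uses the standard library's xml.etree.ElementTree in place of xmltodict, so that it runs without extra packages. The element names (hotspots, hotspot, name, lat, lon) are invented; the real open data file has its own structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML standing in for the open data file.
SAMPLE_XML = """<hotspots>
  <hotspot><name>Taipei City Hall</name><lat>25.0375</lat><lon>121.5637</lon></hotspot>
  <hotspot><name>Taipei Main Station</name><lat>25.0478</lat><lon>121.5170</lon></hotspot>
</hotspots>"""


def read_hotspots(xml_text):
    """Return (latitude, longitude, name) for every hotspot element."""
    root = ET.fromstring(xml_text)
    return [
        (float(node.findtext("lat")), float(node.findtext("lon")), node.findtext("name"))
        for node in root.iter("hotspot")
    ]


hotspots = read_hotspots(SAMPLE_XML)
```

Each tuple can then be marked on the map in the same way as the CSV data points.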



Public toilets (data source)


Handled the same way as above.



Reference


* Creating interactive crime maps with Folium

2015-10-26

[SHARE] Testing Your Code — The Hitchhiker's Guide to Python

Testing Your Code — The Hitchhiker's Guide to Python
Testing your code is very important. Getting used to writing the testing code and the running code in parallel is now considered a good habit. Used wisely, this…

via Instapaper http://ift.tt/1ektVrD

2015-10-23

[SHARE] So, you want to give a lightning talk?

So, you want to give a lightning talk?
Lightning talks… what are lightning talks? Why are they so much fun? Why do we love them? An instant cure for death by PowerPoint, a lightning talk is an…

via Instapaper http://ift.tt/203i7US

[SHARE] Better interactive data science with Beaker and Rodeo

Better interactive data science with Beaker and Rodeo
Domino has offered support for IPython/Jupyter for a while, but we recently added support for two newer, up-and-coming tools for interactive data science:…

via Instapaper http://ift.tt/1MpQZe4

2015-10-18

[SHARE] bliki: TellDontAsk

bliki: TellDontAsk
Tell-Don’t-Ask is a principle that helps people remember that object-orientation is about bundling data with the functions that operate on that data. It…

via Instapaper http://ift.tt/15GybMo

2015-10-17

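A tour of Tokyo's kissaten (coffee houses)

Besides attending PyCon JP 2015, the other important item on this trip to Tokyo was tasting Japanese coffee. Japan has also been influenced by third-wave coffee, and more and more shops there feature light roasts with bright, fruity acidity. But I usually do not care for coffee that is too acidic; I prefer darker roasts with a sweet aftertaste. I also have an inexplicable longing for old-school kissaten, so on this trip I visited several well-known ones.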

カフェ・ド・ランブル(link)

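This is the shop everyone calls 琥珀咖啡 (Amber Coffee), located between Ginza and Shimbashi. It serves coffee and nothing else. The old shop is full of history and has a warm atmosphere. By today's standards the decor is nothing flashy. The focus is entirely on presenting the coffee: the beans, the roast, the cups, the hand pour. Old Japanese kissaten seem to like using tableware of their own design; the cups here are also the shop's own and are sold in the store. I ordered a 2003 Guatemala, served in very light, thin bone china. It is slightly acidic at first, then gives off the distinctive flavor of aged beans, and finishes with a hint of roasted aroma. Aged coffee is the shop's selling point: green beans are stored for more than ten years before roasting, and I wonder whether the idea is similar to dry-aged steak. The staff said that aging the beans for more than ten years does not alter the original flavor, for example by making the acidity stronger, but it does add a distinctive note.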

センリ軒(link)

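センリ軒 is in Tsukiji Market. Most people who visit Tokyo make a point of going to Tsukiji early in the morning to queue for fresh seafood. I like the early-morning energy of the market, and I also wanted to see whether the queues were as crazy as on my last visit. But after having a seafood rice bowl (kaisendon) for breakfast once, I found that it really is not my thing. センリ軒 is next door to 仲家, which is famous for its kaisendon. We were probably the only tourists there; with many customers, the staff only asked as they walked in whether they wanted hot or iced today, with no need to order. The butter toast here is delicious, and with a cup of hot coffee, now this is breakfast! Coffee is not the selling point, just an ordinary dark-roast black coffee; I came purely for the butter toast. I hear there is another good kissaten in Tsukiji Market called 愛養 (link), right next to the extremely famous 寿司大. I will definitely try it next time.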

カフェーパウリスタ(link)

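カフェーパウリスタ is near カフェ・ド・ランブル. I came here purely because it opened in 1911 and has a century of history. The scone breakfast set is decent; it comes with butter and blueberry jam and, paired with the パウリスタオールド blend, is wonderfully aromatic. But the waiter we got was rather poor. Once he knew we spoke Chinese he served us in Chinese, yet when we asked about details of the menu his explanations were unclear; I wonder whether asking in Japanese would have gone better. What annoyed me most was that he rushed to take my plate away before I had finished my scone, even though the shop had hardly any customers. Are they that short of plates? Not recommended at all!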

神田伯刺西爾(link)

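神田伯刺西爾 is near the Jimbocho used-book district. I had picked out a few coffee shops in Jimbocho to try, but on the way I overheard a passer-by saying this one has the best coffee in the area, so that settled it. The decor is again old-school, and the hand-drawn dessert menu is great fun. I ordered a chiffon cake with the 神田ぶれんど. The house blend is strong and on the bitter side, with a touch of smokiness and almost no acidity, which goes very well with dessert. The chiffon cake has a delicate texture and the whipped cream is not cloying at all; it is the first whipped cream I have had, other than the one from 紅葉蛋糕, that is not cloying, and the almond slices on top add texture. The coffee suited my taste so well that I took a bag of beans home to brew myself.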

2015-10-15

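Japanese TV is teaching R


I was startled when I turned on the TV and flipped to this channel: it was actually teaching R. I am curious what the ratings are like.

Travel planning with Trello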



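I used to plan trips abroad with Google Sheets (here); one advantage is that several people can edit at once. The tabular format lays all the information out in a fixed structure, and although adding data is easy, in practice different kinds of information end up in different tabs, which is not very intuitive. Each day's itinerary gets revised again and again, and in a table that often means cutting and pasting whole blocks over and over.

Later I found that planning a trip with Trello (here) is quite convenient and intuitive. Each Card records one piece of information, and besides a description a Card can carry links or images. I define a few key Lists: Information, Backlog, Gift, and one for each day of the trip. The benefits of Trello are that it combines text and images, drag and drop makes changing the itinerary very easy, and it also supports editing by several people.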

Information
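Holds flight details, accommodation details, the budget, and information I often look up while planning. Before departure I use a Checklist to confirm that nothing important is missing.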

Backlog
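A staging area for planning. An entry can be a sight, a shop, an exhibition, or even a whole district. I like to classify entries with Labels of different colors so they are easy to tell apart.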

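Daily itinerary
Basically, drag Cards from the Backlog into each day's plan and order them chronologically; drag and drop makes changing the itinerary easy. Cards can also be used to note transport connections.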

Gift
Things to buy and gifts.

2015-10-06

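[SHARE] Getters and setters: to use or not to use

Getters and setters: to use or not to use
In the Java world there is a long-standing question of whether getters and setters should be used. Not only do beginners often find them superfluous, even veterans occasionally fight several rounds over it from the angles of encapsulation, maintenance, abstraction and so on.…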

via Instapaper http://ift.tt/1QVq5J2

2015-10-04

2015-09-27

[SHARE] Charming Python: Iterators and simple generators

Charming Python: Iterators and simple generators
New constructs in Python 2.2 Python 2.2 introduces a new construct accompanied by a new keyword. The construct is generators; the keyword…

via Instapaper http://ift.tt/1YHYelr

[SHARE] Python Programming Interview Questions with Answers Part-1

Python Programming Interview Questions with Answers Part-1
Today we are going to present you Part-1 of top 20 Python programming interview questions. We’ve selected these questions after exhaustive research. The…

via Instapaper http://ift.tt/1VgkfZo

[SHARE] Git Hooks

Git Hooks
Git hooks are scripts that run automatically every time a particular event occurs in a Git repository. They let you customize Git’s internal behavior and…

via Instapaper http://ift.tt/1KFOSfB

2015-09-23

[SHARE] What is HDR for TVs, and why should you care? - CNET

What is HDR for TVs, and why should you care? - CNET
Geoffrey Morrison HDR, or high dynamic range, is poised to be the next big thing in TVs. We've been talking about it for several years , but finally a few…

via Instapaper http://ift.tt/1itJrtC

[SHARE] Silicon Valley spends money on people, Asia spends it on marketing

Silicon Valley spends money on people, Asia spends it on marketing
If you’re around Asia’s startup scene long enough, you’ll start to hear the same tunes over and over again: solve a problem, target a big market, develop a…

via Instapaper http://ift.tt/1Mu7uWh

2015-09-22

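[SHARE] 【創報】Ben Horowitz: for a first startup to succeed, PMF matters most; everything else is secondary

【創報】Ben Horowitz: for a first startup to succeed, PMF matters most; everything else is secondary
Posted by: der, 2015-09-15, in All Posts. Few would deny that Ben Horowitz is one of Silicon Valley's best-known, most successful and most respected venture capitalists. He was an early investor in Facebook, Twitter, Groupon and Skype, all of which later became highly successful, and co-founded Andreessen…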

via Instapaper http://ift.tt/1NADmJj

2015-09-21

Reading list

There are so many books I want to read; my book list (here) grows far faster than I can read XDDD

2015-09-17

WISDOM acrostic -- understanding an audience

[W]hat do you want them to learn?
What is their [i]nterest in what you've got to say?
How [s]ophisticated are they?
How much [d]etail do they want?
Whom do you want to [o]wn the information?
How can you [m]otivate them to listen to you?

2015-09-15

[SHARE] 5 Questions Every Unit Test Must Answer — JavaScript Scene

5 Questions Every Unit Test Must Answer — JavaScript Scene
Every developer knows we should write unit tests in order to prevent defects from being deployed to production. What most developers don’t know are the…

via Instapaper http://ift.tt/1LJUZWk

2015-09-13

[SHARE] nealford.com • Knowledge Breadth versus Depth

nealford.com • Knowledge Breadth versus Depth
Mark Richards , my co-presenter in the Software Architecture Fundamentals video series , is a a deeply knowledgable architect who doesn’t blog much, so I’m…

via Instapaper http://ift.tt/1Q1P0KA

Pour-over coffee and water



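A while ago a typhoon left the tap water cloudy, so I went to Costco and bought this mineral water. Since I brew coffee every day, I switched to the mineral water for brewing too, and the taste turned out to be very different. Compared with the filtered tap water I normally use, the coffee's aftertaste is more pronounced and it has more layers. After a month of pour-over coffee brewed with mineral water I feel there is no going back; I did not expect water to affect pour-over coffee this much.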

Aberfoyle Springs - Canadian Mountain Spring Water
Volume: 1.5 L
Hardness: 63.5 mg/L

2015-09-12

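A very interesting stationery shop: no sign, you ring the apartment doorbell yourself and go upstairs. A paradise for stationery lovers.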


via Instagram http://ift.tt/1NnUVur

via Instagram http://ift.tt/1O6WgHD

:set relativenumber



Recently I found a Vim setting, :set relativenumber, that makes life easier. Most tutorials and .vimrc references use :set number to show line numbers, which makes code easier to follow. With absolute line numbers only, however, I have to work out the range of lines in my head before deleting or yanking, and I often get the count wrong. With :set relativenumber there is no need to calculate by hand: the distance of every line from the current line, both above and below, is shown in the side column. This is helpful when operating on several lines of text at once.

Review of "Apprenticeship Patterns"

This is truly a great reference book for software craftsmanship.

Dave starts with the story of his own experience of software craftsmanship. The book then defines software craftsmanship and what it means to be an apprentice, a journeyman and a master, and introduces apprenticeship and apprenticeship patterns. In the following chapters, Dave and Ade use a Context-Problem-Solution-Action pattern to address the different situations we may run into in our programming lives. Every situation is a signal to improve ourselves, and the book shows how to proceed when we are stuck in one.

To level up, we first need to forget who we are (emptying the cup), then think about what we want and about the long-term goal (walking the long road). Once the direction is chosen, just do it and do not be afraid of failure (accurate self-assessment). Software craftsmanship has no real end; all we can do is keep getting better throughout our lives (perpetual learning, construct your curriculum).

Chinese kung fu has two kinds of training: one trains skill, the other trains the mind. "Apprenticeship Patterns" is more like the latter. Instead of hard skills, the book offers guidance on self-improvement at each stage of software craftsmanship.

2015-09-07

Check if socket is broken in Python

Context
You set up a socket from a client to a server for communication, and you want to be informed when the socket is broken.

Problem
If the server closes the socket, the client's recv() returns an empty string and no more data arrives. However, if the server shuts down unexpectedly or the network connection is broken, you may not get any hint.

Solution
Enable TCP keep-alive on the socket as follows:

_socket.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
try:
    _socket.recv(1024)
except socket.error as e:
    handle_socket_error(e)

When the connection is broken, the TCP keep-alive probes fail and socket.error is raised from socket.recv(). By default the keep-alive time is no less than two hours, so the exception may take two hours or more to be raised. If you want to react faster to a broken socket, reduce the keep-alive time.

For Linux:
_socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, after_idle_sec)
_socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_sec)
_socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, max_fails)

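For OSX:
# Not exposed by Python 2's socket module, so define it by hand.
# On OSX this option sets the idle time before the first keep-alive probe.
TCP_KEEPALIVE = 0x10
_socket.setsockopt(socket.IPPROTO_TCP, TCP_KEEPALIVE, after_idle_sec)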

For Windows:
_socket.ioctl(socket.SIO_KEEPALIVE_VALS, (1, keepalive_time_msec, keepalive_interval_msec))

Note that on Windows the times are given in milliseconds.
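The per-platform settings above can be wrapped in one helper. This is only a sketch: enable_keepalive and its default values are made up for illustration, and the Windows and OSX branches are written from the references below rather than tested on those platforms.

```python
import socket
import sys


def enable_keepalive(sock, after_idle_sec=60, interval_sec=10, max_fails=5):
    """Turn on TCP keep-alive and shorten its timers where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if sys.platform.startswith("linux"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, after_idle_sec)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_sec)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, max_fails)
    elif sys.platform == "darwin":
        # Fall back to the raw option number when the module lacks the name.
        tcp_keepalive = getattr(socket, "TCP_KEEPALIVE", 0x10)
        # On OSX this option is the idle time before the first probe.
        sock.setsockopt(socket.IPPROTO_TCP, tcp_keepalive, after_idle_sec)
    elif sys.platform == "win32":
        # Windows takes milliseconds: (on/off, idle time, probe interval).
        sock.ioctl(
            socket.SIO_KEEPALIVE_VALS,
            (1, after_idle_sec * 1000, interval_sec * 1000),
        )


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With these example values on Linux, a dead peer would be noticed after roughly after_idle_sec + interval_sec * max_fails seconds, instead of two hours or more.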

Reference
[1] https://docs.python.org/2/library/socket.html
[2] http://stackoverflow.com/a/14855726/1249320
[3] https://msdn.microsoft.com/en-us/library/dd877220%28v=vs.85%29.aspx

2015-09-05

[SHARE] Code Review Best Practices

Code Review Best Practices
At Wiredrive, we do a fair amount of code reviews. I had never done one before I started here so it was a new experience for me. I think it’s a good idea to…

via Instapaper http://ift.tt/1EYnN7M

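[SHARE] How to get started with agile project development? (A reading guide)

How to get started with agile project development? (A reading guide)
I first came across the term agile development around 2008. As a beginner I only vaguely understood it, but after going through more than a hundred projects large and small and trying a dozen or so ways of running projects, over these seven years I have gradually built up my own systematic set of principles and practices. Friends often ask me how to learn "agile development". Of course, the fastest way is the one I would recommend: sign up for the 2015 edition of the Agile Project Management class…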

via Instapaper http://ift.tt/1Ng1dO0

[SHARE] Overview of Single vs. Multi Server Architecture

Overview of Single vs. Multi Server Architecture
Each time I post a setup guide for configuring a Django server there are questions about how I came upon the Nginx and Apache multi-server approach as opposed…

via Instapaper http://ift.tt/1NgV3x8

2015-09-04

Staple-less Stapler

Staple-less Stapler
I first encountered these tools on a site that dealt with Japanese imports. Apparently they’re more popular there than in North America. They’re basically…

September 4, 2015 at 11:10AM
via Instapaper http://ift.tt/1J3rSHr

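Japan's "coffee culture": ever deeper in substance, ever more varied in form | nippon.com

Japan's "coffee culture": ever deeper in substance, ever more varied in form | nippon.com
Coffee shops in regional cities: a symbol of modern Japanese coffee culture. After the staff showed me to my seat I opened the menu and found more than 20 drinks listed under coffee, for example "Saza調", "Saza Glorious (Colombia)", "Gorda Los Pirineos Farm (El Salvador)", "Kenya", "Mandheling (Indonesia)", "Geisha Natural…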

September 4, 2015 at 11:09AM
via Instapaper http://ift.tt/1Ofed3x

2015-07-15

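MRT timetables have finally appeared



Riding the MRT recently, I noticed that timetables have appeared in the stations; at last you can see the timetable for each station.

Taipei Metro should have published per-station timetables long ago. To my mind the biggest advantage of rail transport is punctuality rather than speed. Until now commuters caught their trains by experience; with published timetables everyone can more easily plan for the train they want, which should cut down on people sprinting through stations. It is also good news for anyone who needs to transfer, because connecting to the next train becomes easier.

However, the official website currently offers the per-station timetables only as PDFs (route map, station information and timetables), which is not very convenient to search.

Update: a friend posted a link to a version that has already been compiled (Taipei MRT timetable).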

2015-06-28

Toraya yokan + Yamada Costa Rica


via Instagram http://ift.tt/1BYhjrG

Pigeon friends


via Instagram http://ift.tt/1Jfbo0R

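Opportunity cost

Opportunity Cost (OC). Wikipedia puts it this way:
In microeconomic theory, the opportunity cost of a choice is the value of the best alternative forgone, in a situation in which a choice needs to be made between several mutually exclusive alternatives given limited resources.
In 創新是一種態度 (Innovation Is an Attitude), 翟本喬 writes: opportunity cost is a company's biggest cost.
Simply put, opportunity cost is this: you decide to do something and spend resources on it, including money, time and manpower; the opportunity cost is the greatest benefit you could have obtained by spending those resources on something else.
In the book, 翟本喬 uses the Taipei City bus lanes and the choice between playing online games and studying as examples. Demolishing the bus lanes cost NT$4 million, and rebuilding them two years later cost NT$6 million. The money spent was at least NT$10 million, but once opportunity cost is considered, the traffic disruption, wasted time and other losses involved may well exceed NT$10 million. An online game costs NT$450 for two hundred hours (a monthly card), but if those two hundred hours were spent learning and getting better in some field, you would be in a position to find a better job; even if your salary rose by only NT$1,000 a month, over a lifetime that adds up to several hundred thousand.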
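The arithmetic behind these two examples can be checked in a few lines. The 30 working years are my own assumption for "a lifetime"; the book does not give a number.

```python
# Bus lanes: direct spending only, before any opportunity cost is counted.
demolition_cost = 4000000
rebuild_cost = 6000000
direct_cost = demolition_cost + rebuild_cost  # 10,000,000

# Studying instead of gaming: a raise of 1,000 per month.
monthly_raise = 1000
working_years = 30  # assumption, not from the book
lifetime_gain = monthly_raise * 12 * working_years  # 360,000
```

Under that assumption the raise adds up to 360,000, which matches the "several hundred thousand" in the text, against 450 spent on the monthly card.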

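The book's main thread on opportunity cost is employee benefits. Why are Google's benefits so good? Because, once opportunity cost is taken into account, offering such benefits brings Google the greatest return. I think such a conclusion can be reached only because all costs are tracked very finely: not just tangible costs, but even man-hour costs, the hardest to assess, must be measured accurately.

After reading this book I realized that I never think about opportunity cost when making everyday decisions. I should make weighing opportunity cost a habit.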

2015-06-27

[Clean Code] Reading notes - Ch. 6

Chapter 6 is titled Objects and Data Structures. It boils down to two main points: the difference between objects and data, and the Law of Demeter.

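With a plain data structure you define only members and provide no methods, like a struct in C [link]. The members can be used directly, and all logic that operates on them lives outside the data structure.
With an object you define both members and methods, but the member variables are encapsulated inside the object and every operation has to go through the methods the object provides.
There is an anti-symmetry between objects and data structures. Seen from outside, a data structure exposes members you can manipulate but has no methods to call (there are none); an object exposes methods you can call but hides the details of its internal members. (Python cannot actually prevent members from being used from outside, but naming conventions set them apart and help avoid misuse: self._foo and self.__bar.)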

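This anti-symmetry affects how new features are added. Suppose we later want a method that computes the perimeter. In the data structure implementation only class Geometry has to change, by adding def perimeter to it; in the object implementation, class Shape, class Square, class Rectangle and class Circle all have to change, which is a large change. Conversely, suppose we want to add a new shape, class Triangle. In the object implementation we only need to add class Triangle(Shape); in the data structure implementation, besides adding class Triangle we must also modify every operation in class Geometry.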
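The two styles discussed above can be sketched like this. The class names Geometry, Shape, Square, Rectangle and Circle come from the text; the attribute names and the SquareObject and CircleObject names are my own choices for illustration, not the code from the original post.

```python
import math


# Data-structure style: the classes only hold members.
class Square:
    def __init__(self, side):
        self.side = side


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height


class Circle:
    def __init__(self, radius):
        self.radius = radius


class Geometry:
    """All logic lives outside the data and switches on the type."""

    def area(self, shape):
        if isinstance(shape, Square):
            return shape.side ** 2
        if isinstance(shape, Rectangle):
            return shape.width * shape.height
        if isinstance(shape, Circle):
            return math.pi * shape.radius ** 2
        raise TypeError("unknown shape: %r" % (shape,))

    # Adding perimeter() only touches this class,
    # but adding Triangle means editing every method here.


# Object style: members are hidden behind methods.
class Shape:
    def area(self):
        raise NotImplementedError


class SquareObject(Shape):
    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side ** 2


class CircleObject(Shape):
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2

    # Adding a new Shape subclass touches nothing else,
    # but adding perimeter() means editing every subclass.
```

Geometry().area(Square(2)) and SquareObject(2).area() both give 4; what differs is which side of the code has to change when requirements grow.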

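Which is better, objects or data structures? It depends on the likely future requirements. If new classes may be added but their methods will not change, objects fit better; if the classes rarely change but new methods may keep being added, data structures fit better.

The other topic of this chapter is the Law of Demeter, a rule that applies to how objects are implemented.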
More precisely, the Law of Demeter says that a method f of a class C should only call the methods of these:
* C
* An object created by f
* An object passed as an argument to f
* An object held in an instance variable of C
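The Law of Demeter gives a rule for encapsulating an object's member variables: outside code should not fetch an object's members and operate on them directly; every operation on the members must be wrapped in the object's methods. The object necessarily knows how its own members are used and manipulated, so it is reasonable to implement that logic inside its methods, and it also makes the overall program structure easier to maintain. The benefit is that when the object has to change its internal members, outside users do not need to change how they call it, so all changes stay confined to the object itself.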
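A small Python sketch of the rule, using a made-up Cashier, Customer and Wallet trio (these are not from the book or the original post):

```python
class Wallet:
    def __init__(self, balance):
        self._balance = balance

    def pay(self, amount):
        """Take the amount out of the wallet, or fail if there is not enough."""
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount
        return amount

    def balance(self):
        return self._balance


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet

    def pay(self, amount):
        # Allowed: calling a method of an object held in an instance variable.
        return self._wallet.pay(amount)

    def remaining(self):
        return self._wallet.balance()


class Cashier:
    def charge(self, customer, amount):
        # Allowed: calling a method of an object passed as an argument.
        # A violation would be: customer._wallet._balance -= amount
        return customer.pay(amount)


customer = Customer(Wallet(100))
paid = Cashier().charge(customer, 30)
```

If Wallet later stores its balance differently, only Wallet changes; Cashier keeps calling customer.pay().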

2015-06-19

git merge or rebase?

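Yesterday I joined the Git Workflow study group at Hacking Thursday (H4), and a question suddenly occurred to me:
After a feature branch is finished and before sending the pull request, what if the master branch (or develop branch) has new commits, as in the figure?

At this point, should I git rebase or git merge first so that the feature branch catches up with the master branch?
I have always used git merge to bring the feature branch up to date before sending the pull request, without knowing whether that is what everyone else does; to my surprise, both camps were represented at H4. A quick search showed that the debate is no less heated than which editor is better, Emacs or vim. I read two articles, Git team workflows: merge or rebase? and Merge or Rebase?, and found them good summaries.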

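Using git merge looks like this:

The advantages of this approach: it is easy to understand and use; it keeps the complete history, which helps tracing; and if the feature branch has also been shared, git merge will not wreck everyone's history. Its biggest drawback is that the history becomes very complicated; if many people branch (fork), the history may well turn into a mesh rather than a tree.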

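Using git rebase looks like this:

The biggest advantage of this approach is that it simplifies the history, removes the noise and makes it easier to read. Its drawbacks: git rebase is not intuitive to use and is awkward when resolving conflicts; because the history has been simplified, some information disappears, which hinders later tracing; and for a publicly shared feature branch it often breaks other people's history.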

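Reading everyone's discussion, you may wonder why the extra command git rebase was created at all. git merge is about recording the complete history, while git rebase makes the history easier to read. For a project manager git rebase may be more attractive, because a readable history also makes it easy to produce progress reports. For a developer, though, the detailed history matters most. As Thinker from H4 put it:
You have to be honest with yourself.
Before this I had never thought deeply about the question; I simply felt git merge was enough, and I had often heard that git rebase should be used with care. So when contributing to projects on GitHub I stuck with git merge to stay out of trouble. Only after looking into it properly did I realize what an interesting question it is. To preserve the complete history, I can now use git merge with more confidence~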

2015-06-14

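[Clean Code] Reading notes - 2

Chapter 2 is about Meaningful Names. Naming really is a deep subject. In The 9 hardest things programmers have to do, number one is "naming things"; even native English speakers find naming hard when they write code.

Naming has two aspects: meaning and style. Meaning is subjective; the words chosen change with what the variable or function does. Clean Code concentrates on how to make names meaningful, but the name of a variable or function shifts with when it is used and the scope it lives in, so it is hard to have one fixed naming method. The author lists some pointers, good and bad. For example, day is easier to understand than d, and finishDay describes more than day; but whether to use finishDay or completeDay depends on the situation, and it is hard to say which is better. I think meaningful naming is like word choice in writing: to express the same idea some people choose precise words and others like ornate ones, but the point is that it must be understandable rather than just a pile of words, so that readers can grasp the code faster.

Mosky's talk at Taipei.py in May, Beyond the Style Guides, is also about meaningful naming.

Naming style is the more objective part; it defines the structure of a name. The same account data, for example, can be written many ways: accountdata, AccountData, accountData, account_data. Many teams fix a style in advance and ask everyone to follow it, simply for the sake of reading habits. Python's PEP 8 also provides naming-style rules for reference.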
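To see the four spellings side by side, here is a small helper that builds them from a list of words. The function name and the dictionary keys are my own labels; PEP 8 itself calls the last style lower_case_with_underscores.

```python
def naming_styles(words):
    """Return the four spellings listed above for a list of lowercase words."""
    capitalized = [word.capitalize() for word in words]
    return {
        "lowercase": "".join(words),
        "CapWords": "".join(capitalized),
        "mixedCase": words[0] + "".join(capitalized[1:]),
        "snake_case": "_".join(words),
    }


styles = naming_styles(["account", "data"])
```

In PEP 8 terms, AccountData (CapWords) is the style recommended for class names and account_data for functions and variables.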

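I also used to think that Hungarian Notation (HN) included the CamelCase rule, and only later realized I was mistaken. Hungarian Notation means prefixing each variable name with a lowercase abbreviation of its data type, so that readers of the code can quickly tell the variable's type. Anyone who often uses the Windows C API will know that in the API documentation on MSDN every variable name carries such a type prefix, for example in DeviceIoControl.

My takeaway from Chapter 2 is that, besides making names meaningful, consistency matters a lot too: consistent naming reduces the reader's confusion. The book gives the example that manager, controller and driver serve similar purposes, and unless there is a real difference they should not be mixed, or readers will be even more confused. Understanding someone else's code is hard enough; having to guess how the controller here differs from the driver there is a waste of time.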

@classmethod and @staticmethod in Python

In C++ or Java, the static keyword on a class method declares that the function is bound to the class. Such functions are also known as class methods, and you can call them at class level without creating an instance first. In Python, however, a static method and a class method are different things. Take the following Python code as an example:

class MethodDemo(object):
    class_number = 100

    def __init__(self):
        self.number = 0

    def Show(*arg):
        print "instance method", arg

    @classmethod
    def ClassShow(*arg):
        print "class method", arg

    @staticmethod
    def StaticShow(*arg):
        print "static method", arg

def main():
    d = MethodDemo()
    d.Show()
    d.ClassShow()  # same as MethodDemo.ClassShow()
    d.StaticShow() # same as MethodDemo.StaticShow()

if __name__ == '__main__':
    main()

And the result on my computer is:

instance method (<__main__.MethodDemo object at 0x10cef93f8>,)
class method (<class '__main__.MethodDemo'>,)
static method ()

The first argument of the instance method Show() is the instance of the class (the method is bound to the instance). Such a method is usually defined as "def Show(self):", where self is the instance argument. Because the instance itself is passed in, other instance methods and instance variables, such as self.number, are accessible.

A method defined with the @classmethod decorator is a class method. As the snippet above shows, the first argument passed to a class method is the class object (the method is bound to the class). As a result, a class method can access other class methods and class variables, such as class_number.

A method defined with the @staticmethod decorator is a static method. No argument is passed to a static method implicitly (it is bound to nothing). It behaves like a normal function defined outside the class in global scope; the class name simply serves as a namespace, which avoids clashes between functions of the same name in global scope.
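As a sketch of where each kind of method is typically useful, here is a made-up Temperature class (not part of the demo above): the class method acts as an alternate constructor, and the static method is a plain helper kept inside the class namespace.

```python
class Temperature(object):
    unit = "C"  # class variable, reachable through cls or self

    def __init__(self, degrees):
        self.degrees = degrees  # instance variable, reachable through self

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        """Alternate constructor: cls is the class the method was called on."""
        return cls(cls.to_celsius(fahrenheit))

    @staticmethod
    def to_celsius(fahrenheit):
        """Plain function living in the class namespace; no self or cls."""
        return (fahrenheit - 32) * 5.0 / 9.0

    def describe(self):
        """Instance method: uses both instance and class variables."""
        return "%.1f %s" % (self.degrees, self.unit)


boiling = Temperature.from_fahrenheit(212)
```

Because from_fahrenheit() builds the object with cls(...) rather than Temperature(...), a subclass calling it gets an instance of the subclass.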

2015-06-13

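[Clean Code] Reading notes - 1

I recently joined a study group discussing Clean Code (Chinese edition: 無瑕的程式碼; 博客來 link). Reading the Chinese edition is easier, but some of the translation made me want to see the original, so I bought the English edition as well. Besides sharing my reading notes, I will also share the Chinese and English passages side by side.

On page 2, in the fourth paragraph of the section "There Will Be Code" (程式碼將一直存在), the Chinese edition reads:
某些傢伙認為程式碼總有一天會消失,就像期望數學不必太正式的數學家。 (Literally: some folks think code will disappear one day, like mathematicians who expect mathematics not to have to be too official.)
I really could not make sense of the sentence about mathematicians, so I looked up the original:
The folks who think that code will one day disappear are like mathematicians who hope one day to discover a mathematics that does not have to be formal.
Here, translating formal as 正式的 (official) seems off. Going by sense 19 of formal, it should mean 經過嚴謹證明的 (rigorously proven).

Chapter 1 lays out the philosophy: the author asked a number of masters for their views on clean code. Their original words are listed here.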

Bjarne Stroustrup, inventor of C++ and author of The C++ Programming Language
I like my code to be elegant and efficient. The logic should be straightforward to make it hard for bugs to hide, the dependencies minimal to ease maintenance, error handling complete according to an articulated strategy, and performance close to optimal so as not to tempt people to make the code messy with unprincipled optimizations. Clean code does one thing well.


Grady Booch, author of Object Oriented Analysis and Design with Applications
Clean code is simple and direct. Clean code reads like well-written prose. Clean code never obscures the designer's intent but rather is full of crisp abstractions and straightforward lines of control.


"Big" Dave Thomas, founder of OTI, godfather of the Eclipse strategy
Clean code can be read, and enhanced by a developer other than its original author. It has unit and acceptance tests. It has meaningful names. It provides one way rather than many ways for doing one thing. It has minimal dependencies, which are explicitly defined, and provides a clear and minimal API. Code should be literate since depending on the language, not all necessary information can be expressed clearly in code alone.


Michael Feathers, author of Working Effectively with Legacy Code
I could list all of the qualities that I notice in clean code, but there is one overarching quality that leads to all of them. Clean code always looks like it was written by someone who cares. There is nothing obvious that you can do to make it better. All of those things were thought about by the code's author, and if you try to imagine improvements, you're led back to where you are, sitting in appreciation of the code someone left for you--code left by someone who cares deeply about the craft.


Ron Jeffries, author of Extreme Programming Installed and Extreme Programming Adventures in C#
In recent years I begin, and nearly end, with Beck's rules of simple code. In priority order, simple code:
* Runs all the tests;
* Contains no duplication;
* Expresses all the design ideas that are in the system;
* Minimizes the number of entities such as classes, methods, functions, and the like.

Of these, I focus mostly on duplication. When the same thing is done over and over, it's a sign that there is an idea in our mind that is not well represented in the code. I try to figure out what it is. Then I try to express that idea more clearly.

Expressiveness to me includes meaningful names, and I am likely to change the names of things several times before I settle in. With modern coding tools such as Eclipse, renaming is quite inexpensive, so it doesn't trouble me to change. Expressiveness goes beyond names, however. I also look at whether an object or method is doing more than one thing. If it's an object, it probably needs to be broken into two or more objects. If it's a method, I will always use the Extract Method refactoring on it, resulting in one method that says more clearly what it does, and some submethods saying how it is done.

Duplication and expressiveness take me a very long way into what I consider clean code, and improving dirty code with just these two things in mind can make a huge difference. There is, however, one other thing that I'm aware of doing, which is a bit harder to explain.

After years of doing this work, it seems to me that all programs are made up of very similar elements. One example is "find things in a collection." Whether we have a database of employee records, or a hash map of keys and values, or an array of items of some kind, we often find ourselves wanting a particular item from that collection. When I find that happening, I will often wrap the particular implementation in a more abstract method or class. That gives me a couple of interesting advantages.

I can implement the functionality now with something simple, say a hash map, but since now all the references to that search are covered by my little abstraction, I can change the implementation any time I want. I can go forward quickly while preserving my ability to change later.

In addition, the collection abstraction often calls my attention to what's "really" going on, and keeps me from running down the path of implementing arbitrary collection behavior when all I really need is a few fairly simple ways of finding what I want.

Reduced duplication, high expressiveness, and early building of simple abstractions. That's what makes clean code for me.


Ward Cunningham, inventor of Wiki, inventor of Fit, coinventor of eXtreme Programming. Motive force behind Design Patterns. Smalltalk and OO thought leader. The godfather of all those who care about code.
You know you are working on clean code when each routine you read turns out to be pretty much what you expected. You can call it beautiful code when the code also makes it look like the language was made for the problem.

2015-06-10

Embedding Python with multi-thread in C++ application

Initial work flow
1. Application starts from C++ layer
2. C++ layer invokes function in Python layer in main thread
3. The Python layer function in main thread creates an event thread
4. Starts the event thread in Python layer and goes back to C++ layer
5. Main loop starts in C++ layer
6. The event thread invokes callback function in C++ layer if needed

From the beginning, the event thread did not work as expected. From what I saw I guessed the GIL was the cause, so I tried to solve it from that angle. Here is my solution.

Analysis
First, from note in PyEval_InitThreads,
When only the main thread exists, no GIL operations are needed. ... Therefore, the lock is not created initially. ...
So if multiple threads are needed, PyEval_InitThreads() must be called in the main thread. I call PyEval_InitThreads() before Py_Initialize(). Now the GIL is initialized and the main thread holds it.

Second, each time before a Python function is invoked from the C++ layer, PyGILState_Ensure() is called to acquire the GIL, and after the Python function returns, PyGILState_Release(state) is called to restore the previous GIL state. As a result, PyGILState_Ensure() is called before step 2 and PyGILState_Release(state) after step 4.

But there is a problem. According to PyGILState_Ensure and PyGILState_Release, these two functions save the current GIL state when acquiring the GIL and restore the previous state when releasing it. However, after PyEval_InitThreads() is called in the main thread, the main thread already holds the GIL. The GIL state in the main thread is then as follows:

/* main thread owns GIL by PyEval_InitThreads */
state = PyGILState_Ensure();
/* main thread owns GIL by PyGILState_Ensure */
...
/* invoke Python function */
...
PyGILState_Release(state);
/* main thread owns GIL due to go back to previous state */

As the code sample above shows, the main thread always holds the GIL, so the event thread never runs. To get around this, the main thread must not hold the GIL before it calls PyGILState_Ensure(); then PyGILState_Release(state) really releases the GIL and lets the event thread run. So the main thread should release the GIL immediately after the GIL is initialized.

Here PyEval_SaveThread() is used. From PyEval_SaveThread,
Release the global interpreter lock (if it has been created and thread support is enabled) and reset the thread state to NULL, ...
With this change, embedding Python with multiple threads works.

Work flow after modification
1. Application starts from C++ layer
2. PyEval_InitThreads(); to enable multi-thread
3. save = PyEval_SaveThread(); to release GIL in main thread
4. state = PyGILState_Ensure(); to acquire GIL in main thread
5. C++ layer invokes function in Python layer in main thread
6. The Python layer function in main thread creates an event thread
7. Starts the event thread in Python layer and goes back to C++ layer
8. PyGILState_Release(state); to release GIL in main thread
9. Main loop starts in C++ layer
10. The event thread invokes callback function in C++ layer if needed

2015-06-04

2015-05-02

Home Jukebox with Raspberry Pi



Although iTunes, Spotify and YouTube let me listen to music everywhere and offer huge catalogs, I missed the jazz collection I had ripped from CDs. If you have a collection of digital music files, such as MP3s, and a Raspberry Pi, you already have a home jukebox. My plan is to run a music server on the Raspberry Pi with speakers connected to it, and to control the jukebox from my phone over Wi-Fi. Awesome, right? Let me show you how to do this.

Hardware
Raspberry Pi, Power supply for Raspberry Pi, SD card, Wifi dongle, USB storage with digital music files, Speakers, and Audio connector.

Software
Operating system for Raspberry Pi (here), Mopidy (music server, here) and Mopidy-MusicBox-Webclient (web extension for Mopidy, here)


First, connect the speakers, the USB stick with the music files and the Wi-Fi dongle to the Raspberry Pi. Prepare the operating system on the SD card. After the Raspberry Pi boots successfully, set up the Wi-Fi dongle and configure the USB stick to mount automatically. Then install Mopidy and the extension following the Mopidy documentation (here).


Mopidy is a great music server written in Python that offers two main services, an MPD server and an HTTP server. Mopidy is also extensible through its Python API, JavaScript API and JSON-RPC. Contributors have written useful extensions that play music from Spotify, SoundCloud and even Google Music and YouTube. Mopidy-MusicBox-Webclient is a pure web client written in JavaScript that controls music playback on Mopidy. Because I want to control my jukebox from my phone or laptop, a web client is the easy way. I also contribute to Mopidy-MusicBox-Webclient on GitHub (here).

MPD stands for Music Player Daemon, which provides music playback and control functions (here). Extending MPD directly is also possible, but using Mopidy is easier.
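The JSON-RPC interface mentioned above can also be driven from a short script. This is a sketch under assumptions: the /mopidy/rpc path and the core.playback.* method names are how I understand Mopidy's HTTP API, so check them against the Mopidy documentation, and the function names and the host name are made up.

```python
import json
from urllib.request import Request, urlopen


def build_rpc_payload(method, params=None, request_id=1):
    """Encode a JSON-RPC 2.0 request body as bytes."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return json.dumps(payload).encode("utf-8")


def call_mopidy(host, method, params=None, port=6680, timeout=5):
    """POST one JSON-RPC call to a Mopidy server and return its result field."""
    url = "http://%s:%d/mopidy/rpc" % (host, port)  # assumed endpoint path
    request = Request(
        url,
        data=build_rpc_payload(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8")).get("result")


# Built locally only; nothing is sent over the network here.
example = json.loads(build_rpc_payload("core.playback.get_state").decode("utf-8"))
```

With the jukebox running, something like call_mopidy("raspberrypi.local", "core.playback.play") should start playback, where raspberrypi.local stands in for the address of your own Raspberry Pi.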

If everything is set up correctly, open a URL like http://<raspberry-pi-address>:6680/musicbox_webclient and you will see a control page like this:


Now you can have fun with your own jukebox at home!

2015-04-13

R.app warning on Mac

If you install R.app on a Mac whose default locale is not set to UTF-8, a warning message appears when R.app starts. The warning message is as follows:


To solve this, close R.app and type the following command in a terminal.
$> defaults write org.R-project.R force.LANG en_US.UTF-8
Relaunch R.app and the warning disappears.

Reference
[1] http://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#Internationalization-of-the-R_002eapp

2015-04-06

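The depths of whisky

I have recently fallen for whisky; I bought a few bottles and a few books and started studying its subtleties. I used to know nothing about it and thought whisky was just hot and harsh, with no flavor I could make out. Thanks to drinking coffee, I have become better at telling flavors apart, and besides the burn of high-strength alcohol I can now pick out some other flavors.

Reading the books, I found that whisky is a broad and deep subject too. Today I read explanations of several terms that appear on labels, which finally cleared up my questions.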

Malt Whisky
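Malt whisky is made from barley. The barley grains are steeped in water until they germinate. The malt then goes through mashing (saccharification) and fermentation, followed by distillation and maturation, and the spirit that is finally bottled is whisky.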

Single Malt Whisky
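Whisky produced by a single distillery is called single malt whisky. The raw materials and the production process are consistent throughout, so a single malt whisky carries the distinctive flavor of its distillery.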

Scotch Whisky
Only whisky made in Scotland and matured for at least three years may be called Scotch whisky. Whisky produced elsewhere cannot carry the Scotch name.

Single Malt Scotch Whisky
Single malt whisky produced in Scotland.

Single Cask
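A whisky bottled from the spirit of just one maturation cask is called single cask. In general, a single malt is bottled from several different casks of the same batch, to balance out the differences between casks. A single cask therefore has not only the distinctive flavor of its distillery but also the individual character of its cask.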

Vatted Malt
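Similar to single malt in that it is pure malt whisky. The difference is that it is bottled from the spirit of several distilleries in a nearby region rather than a single distillery, so the whisky expresses the character of the region rather than of one distillery.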

Blended Malt
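This is an easily confused term. It means roughly the same as vatted malt, but because it so easily causes confusion it is used less often. For malt whisky, the scope from which spirit is drawn for bottling, from widest to narrowest, is: vatted whisky > single malt whisky > single cask whisky.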

Grain Whisky
Whisky produced with the same process as malt whisky but without malt as the raw material is called grain whisky. Commonly used raw materials are corn and rye.

Blended Scotch Whisky
Whisky that uses malt whisky and grain whisky as its component spirits and is blended in Scotland is blended Scotch whisky.

Whisk(e)y
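Whisky and whiskey are the same thing. A common misconception is that the difference is just American versus British spelling. In fact the spelling indicates the type of whisky: whisky produced in Scotland and Canada is spelled whisky, whisky produced in Ireland is spelled whiskey, and most American producers use whiskey, although some brands use whisky.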

Reference
[1] Michael Jackson's Malt Whisky Companion 6th Edition
[2] 漫畫威士忌入門 (古谷三敏)

2015-04-02

Color terms

Additive color
Color created by mixing light of different colors. Adding red, green and blue light together produces light resembling sunlight, so the result appears white.
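A minimal sketch of additive mixing, assuming colors are stored as 8-bit RGB triples and that mixing simply sums each channel and clips at 255. The function name add_light is made up for this example.

```python
def add_light(*colors):
    """Additively mix RGB light colors (components 0-255), clipping at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colors))


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

print(add_light(RED, GREEN))        # (255, 255, 0), yellow
print(add_light(RED, GREEN, BLUE))  # (255, 255, 255), white
```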


Subtractive color
Color created by mixing dyes or inks. A dye or ink reflects only the part of the visible spectrum that gives it its color and absorbs the rest. Hence a mixture of red, green and blue inks absorbs the whole visible spectrum and appears black.
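A rough sketch of subtractive mixing, under the simplifying assumption that each ink is an idealized filter described by how much of the red, green and blue bands it reflects (0.0 to 1.0), and that mixing multiplies those reflectances. Real inks behave less cleanly, and the ink constants and the name mix_inks are invented for illustration.

```python
def mix_inks(*inks):
    """Multiply per-band reflectances (0.0-1.0) of idealized inks."""
    result = [1.0, 1.0, 1.0]  # start from white paper reflecting everything
    for ink in inks:
        result = [current * band for current, band in zip(result, ink)]
    return tuple(result)


# Idealized inks as (red, green, blue) reflectances.
RED_INK, GREEN_INK, BLUE_INK = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
CYAN_INK, YELLOW_INK = (0.0, 1.0, 1.0), (1.0, 1.0, 0.0)

print(mix_inks(RED_INK, GREEN_INK, BLUE_INK))  # (0.0, 0.0, 0.0), black
print(mix_inks(CYAN_INK, YELLOW_INK))          # (0.0, 1.0, 0.0), green
```

Printing normally uses cyan, magenta and yellow as primaries because each absorbs only one band, which leaves more colors reachable by mixing.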


Hue
Hue maps colors onto a cyclic color wheel.

Saturation
Saturation is one type of colorfulness. The saturation of a color is determined by a combination of light intensity and how much it is distributed across the spectrum of different wavelengths.

HSL and HSV
HSL (hue-saturation-lightness) and HSV (hue-saturation-value) are cylindrical-coordinate representations of points in an RGB color model.
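As a small example, Python's standard colorsys module converts RGB values to both representations. Note that colorsys works with components in the range 0.0 to 1.0 and names the HSL variant HLS, returning (hue, lightness, saturation) in that order. The helper describe is just a convenience for this sketch.

```python
import colorsys


def describe(rgb):
    """Return hue in degrees plus HSV and HSL components for an 8-bit RGB color."""
    r, g, b = (component / 255.0 for component in rgb)
    hue, sat_hsv, value = colorsys.rgb_to_hsv(r, g, b)
    _, lightness, sat_hsl = colorsys.rgb_to_hls(r, g, b)  # note the HLS order
    return {
        "hue_deg": round(hue * 360),
        "hsv": (round(sat_hsv, 3), round(value, 3)),
        "hsl": (round(sat_hsl, 3), round(lightness, 3)),
    }


print(describe((255, 0, 0)))  # pure red: hue 0, value 1.0, lightness 0.5
print(describe((0, 0, 255)))  # pure blue: hue 240
```

The two models share the same hue but differ in the third axis: a fully saturated primary has value 1.0 in HSV but lightness 0.5 in HSL.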

Chromaticity
An objective specification of the quality of a color regardless of its luminance. Chromaticity consists of two independent parameters, often specified as hue (h) and colorfulness (s).

CIE color space
The human eye has three kinds of cone cells, which sense light, with spectral sensitivity peaks at short (S, 420-440 nm), middle (M, 530-540 nm) and long (L, 560-580 nm) wavelengths. Three parameters (S, M, L), corresponding to the levels of stimulus of the three types of cone cells, can in principle describe any color sensation. This is called the LMS color space.

A color space maps a range of physically produced colors to an objective description of color sensations registered in the eye, typically in terms of tristimulus values. The tristimulus values associated with a color space can be conceptualized as amounts of three primary colors in a trichromatic additive color model.

Because the spectral sensitivity curves overlap, most wavelengths stimulate more than one type of cone cell. Expressed in a normal trichromatic additive color space such as RGB, the tristimulus values for pure spectral colors would therefore imply a negative value for at least one of the three primary colors. To avoid negative RGB values and to have one component describing brightness, "imaginary" primary colors and corresponding color-matching functions have been formulated. The resulting tristimulus values are X, Y, Z (XYZ color space).

Humans tend to perceive light within the green parts of the spectrum as brighter than red or blue light of equal power. The luminosity function that describes the perceived brightness of different wavelengths is thus roughly analogous to the spectral sensitivity of the M cones.

The CIE model defines Y as luminance. Z is quasi-equal to blue stimulation (the S cone response), and X is a mix of cone response curves chosen to be non-negative.

The chromaticity of a color is specified by two derived parameters x and y, two of the three normalized values which are functions of all three tristimulus values, X, Y and Z.

The derived color space specified by x, y and Y is known as xyY color space and is widely used to specify colors in practice.
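The normalization behind x and y is x = X / (X + Y + Z) and y = Y / (X + Y + Z), with z = 1 - x - y. The sketch below converts between XYZ and xyY using these formulas. The function names are made up for this example, and the sample input is the commonly quoted D65 white point, which is an assumption brought in for illustration and does not come from the text above.

```python
def xyz_to_xyy(X, Y, Z):
    """Convert XYZ tristimulus values to (x, y, Y)."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined when X + Y + Z = 0")
    return X / total, Y / total, Y


def xyy_to_xyz(x, y, Y):
    """Convert (x, y, Y) back to XYZ tristimulus values."""
    if y == 0:
        raise ValueError("cannot recover XYZ when y = 0")
    X = x * Y / y
    Z = (1.0 - x - y) * Y / y
    return X, Y, Z


# Commonly quoted D65 white point in XYZ (assumed sample values).
x, y, Y = xyz_to_xyy(0.95047, 1.0, 1.08883)
print(round(x, 4), round(y, 4), Y)  # chromaticity near (0.3127, 0.3290)
```

Keeping Y alongside x and y is what lets the conversion run in both directions: the chromaticity pair alone drops the brightness information.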

CIE Chromaticity Diagram
The outer curved boundary is the spectral locus, with wavelengths given in nanometers. This diagram represents all of the chromaticities visible to the average person. All visible chromaticities correspond to non-negative values of x, y and z.

Reference
[1] https://en.wikipedia.org/wiki/Additive_color
[2] https://en.wikipedia.org/wiki/Subtractive_color
[3] https://en.wikipedia.org/wiki/Hue
[4] https://en.wikipedia.org/wiki/Colorfulness
[5] https://en.wikipedia.org/wiki/HSL_and_HSV
[6] https://en.wikipedia.org/wiki/Chromaticity
[7] https://graphics.stanford.edu/courses/cs178-10/applets/threedgamut.html
[8] https://en.wikipedia.org/wiki/CIE_1931_color_space
[9] https://en.wikipedia.org/wiki/Color_space