Reference
Authors:
Andrew D. Wilson Microsoft Research, Redmond, WA, USA
Hrvoje Benko Microsoft Research, Redmond, WA, USA
Published In:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
This paper is another "technical possibilities" type of article. Instead of forming a hypothesis and setting out to test it, the authors are testing whether a particular kind of technology is reasonable to construct and use. Here they test a prototype they call "LightSpace," an interactive room powered by depth cameras and projectors. Together these can project onto any surface at a very high rate, allowing interactions such as projecting an image onto a user's hand as he walks, or turning any flat surface in the room into a projection board. The system handles all projection itself and combines the entire worldspace into one reference grid, so developers do not have to worry about implementation details such as which camera is projecting what. The paper calls the idea of projecting onto the human body "simulated surfaces": LightSpace keeps a rough mesh of the human bodies within the room and, much like the Microsoft Kinect, tracks this mesh, allowing for easy projections like a menu on the user's hand or a graphic that the user wants to move from surface to surface.
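To picture the "one reference grid" idea, here is a tiny sketch I put together (my own toy code, not the authors'; the matrix values are invented): each camera gets a calibrated 4x4 transform that maps its points into the shared room space.

```python
import numpy as np

# Hypothetical extrinsic matrix from one depth camera's calibration:
# a rotation plus translation taking camera space into the shared room space.
CAMERA_TO_WORLD = np.array([
    [1.0, 0.0, 0.0, 2.5],   # example values only
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.8],
    [0.0, 0.0, 0.0, 1.0],
])

def to_world(points_camera):
    """Map an (N, 3) array of camera-space points into world space."""
    n = points_camera.shape[0]
    homogeneous = np.hstack([points_camera, np.ones((n, 1))])
    return (CAMERA_TO_WORLD @ homogeneous.T).T[:, :3]

# Each camera carries its own transform, so application code only ever
# sees one consistent room-wide coordinate grid.
hand_point = np.array([[0.1, -0.3, 1.2]])
print(to_world(hand_point))
```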
Overall, the paper primarily showcases the various implementation methods the authors used in constructing their LightSpace room. They also show how one can project a 3D image onto a 2D surface, which can then be analyzed with standard image processing techniques.
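As a rough sketch of that 3D-to-2D step (again my own illustration, with assumed grid sizes): flatten the depth points onto a plane and rasterize them into an image that ordinary 2D techniques, like connected-component analysis, can process.

```python
import numpy as np

def project_to_top_down(points_world, cell=0.05, size=(64, 64)):
    """Rasterize (N, 3) world points into a 2D occupancy image seen
    from above: x and y become pixel coordinates, depth is discarded."""
    image = np.zeros(size, dtype=np.uint8)
    cols = (points_world[:, 0] / cell).astype(int)
    rows = (points_world[:, 1] / cell).astype(int)
    inside = (rows >= 0) & (rows < size[0]) & (cols >= 0) & (cols < size[1])
    image[rows[inside], cols[inside]] = 255
    return image  # now usable with standard 2D image processing tools
```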
Discussion
I think the authors accomplished what they set out to do at the beginning of this paper. They wanted to outline their methodology for creating LightSpace, and I feel they did a good job of explaining the various implementation details. Moving on, LightSpace is a very interesting piece of technology. Frankly, many people would be interested in it because of their perceptions of the future: whether from books or movies, many people envision computers as interactive units like the one portrayed in this paper. An interesting point made by the authors is that it would be completely feasible to emulate the functionality of a Microsoft Surface on any flat surface. What would be even more interesting, and probably more useful, would be to allow interaction between the program behind the projectors AND a Microsoft Surface. Since the Surface has such a high resolution compared to the projectors, users might be more comfortable exploring interactions between one Surface and another Surface, or just a board mounted on the wall as in the paper.
Thursday, September 29, 2011
Tuesday, September 27, 2011
"Gang Leader for a Day" Review
I went into the book a little sceptical. I had already taken two sociology classes and had not had good experiences with the field. So when I read an excerpt by the author where he claims he is a "rogue sociologist," I was nervous. After finishing the book, I am very relieved. He, unlike most sociologists, focused on the actual data and situations of the people rather than turning it into a "bash the opposite political party" exercise.
Another interesting point about this book is the location. It takes place in Chicago, so I cannot truly be surprised by this level of corruption. It has long been a running joke in popular culture to make fun of Chicago politics, but in reality, it isn't funny. It is actually a serious matter. Corruption is always present in any form of government, but the level that exists in Chicago is unacceptably high for a city in the USA.
Near the end, I was kind of disappointed by the book. Sudhir felt, for some reason, that he had to distance himself from JT, even going so far as to say he was never his friend. Now, I don't understand how you could hang around someone for seven years, go through so much, and basically make your career off of his life, and then not even consider him a "friend." He didn't have to say that he was his bestest-best-friend of all time, but it is not hard to call someone a friend, and considering all the things JT did for Sudhir, it is just downright cruel to not even give him that label.
Monday, September 26, 2011
Blog Post #11- Multitoe: high-precision interaction with back-projected floors based on high-resolution multi-touch input
Reference Information
Authors:
Thomas Augsten Hasso Plattner Institute, Potsdam, Germany
Konstantin Kaefer Hasso Plattner Institute, Potsdam, Germany
René Meusel Hasso Plattner Institute, Potsdam, Germany
Caroline Fetzer Hasso Plattner Institute, Potsdam, Germany
Dorian Kanitz Hasso Plattner Institute, Potsdam, Germany
Thomas Stoff Hasso Plattner Institute, Potsdam, Germany
Torsten Becker Hasso Plattner Institute, Potsdam, Germany
Christian Holz Hasso Plattner Institute, Potsdam, Germany
Patrick Baudisch Hasso Plattner Institute, Potsdam, Germany
Published In:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
The authors of this paper set out to, in their words, "explore the design decisions" of large high-resolution input devices. Specifically, they explore a floor-based system that detects minute foot presses. The sensing technique they use is called FTIR (frustrated total internal reflection), and it can detect any number of simultaneous pressure points on the surface. A big concern of the paper is whether the authors can eliminate unwanted inputs: since the input device is the floor itself, users have to stand on it and walk across it to do anything, so the system must disregard these incidental foot presses and react only to deliberate ones. To determine the best way to do this, the group ran a study with thirty volunteers. The authors created two simulated buttons on the ground and asked the users to show how they would activate one while ignoring the other. After observing how they behaved, the authors interviewed the volunteers to understand how and why they acted. Through this study, the authors concluded that the best methods were to have users tap, double-tap, jump, or stomp on a button to activate it. A few more studies were performed, each with a different task in mind; examples include which part of the foot should cause activation, and which point should be considered the "hotspot," or primary point for activating buttons.
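Here is a toy version of the kind of filtering involved (my own guess at the logic, not the authors' implementation; the threshold is invented): a contact only counts as a tap if it appears and disappears quickly, which separates deliberate taps from the long rolling contacts of ordinary walking.

```python
TAP_MAX_DURATION = 0.3  # seconds; an assumed threshold, tuned from study data

def classify_contact(press_time, release_time):
    """Label one foot contact as a deliberate tap or ordinary walking.

    Walking produces long heel-to-toe contacts; a deliberate tap is a
    short press-and-release on roughly the same spot.
    """
    duration = release_time - press_time
    return "tap" if duration <= TAP_MAX_DURATION else "walk"

print(classify_contact(0.00, 0.15))  # tap
print(classify_contact(0.00, 0.80))  # walk
```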
Hypothesis:
There isn't so much a hypothesis as a question of usability: is it possible to create a large multi-touch floor that can detect such small differences in foot presses that it could tell users apart by the differences in their shoes' soles? And is it feasible to create a system that users can walk on and still activate controls comfortably? Ultimately the authors collected enough data from their user studies to form the beginnings of ideas for user input. They have not yet constructed the actual room with the floor-based input device, but they have a prototype and detailed construction plans.
Discussion
As always with these discussions, I must be careful to comment on the technical merit of the paper and not just focus on the feasibility or usefulness of the ideas presented. So I will only mention this once: I really don't see the point of this. It has to be built into the construction of the room, so home use is out of the window. In commercial use, an actual keyboard will probably be better 95% of the time, so I can only see this relegated to cheap, gimmicky games or other advertising ploys. Also, as a side note, I have played a LOT of DDR. I mean a lot, and I never, ever liked using the pad to navigate through the menus. Many times my friends or I would just pick up the controller or use the keyboard to make selections and leave the dance pad for the actual game. Even then, playing in arcades was much different than playing at home, because the pads at the arcade were always worn down from too many users putting too much pressure on them, which is a problem I foresee for this device.
That aside, I think this technology is pretty interesting. The ability to differentiate users based on their shoes is very intriguing. Typing with your feet seems pretty frustrating though; aside from the length of time it would take, the motions sound like they would wear on you. I think they were spot on with their methods of distinguishing deliberate input from ordinary walking, though if I were them, I would not force users to jump to do a simple task like bringing up a context menu. This seems ill-suited to older or less athletic users.
Thursday, September 15, 2011
HCI Initial Write-up
Prior Perceptions
To integrate myself, I plan to take the route of the typical new member. I will show up at the meeting and express my interest to join and also mention my complete lack of experience in this area. I suspect that only half of their new members join like this though. The other half probably have been fencing for a while in high school and might know the people in the club through major fencing tournaments. Regardless, as a newbie in the club and in the field of fencing, I hope the group accepts me and 'shows me the ropes' instead of being annoyed by an unskilled person who tags along with the group.
I think to fully understand a group, one must experience said group without the spectre of being an outsider hanging over them the entire time. Thus, I plan to actually join the group and take the route of a new member instead of introducing myself as having a project and asking to observe them. By being on the inside, I hope I can gain an acute understanding of this group, something that would be very difficult as a foreign observer. Honestly, I think the best way to bring up the project would be just to say that I have to write a few reports on "groups" for a class and let them imagine the rest. They will probably assume it is for a sociology or psychology class and think nothing of it. The knowledge that I had this assignment for a while and that I specifically sought out and joined this club solely for this class would probably be harmful to my membership and friendship in this group. And truthfully, I wanted to join this club anyway; I took the opportunity to complete this assignment and get involved in something that seemed interesting to me, and in all honesty, I probably would have joined even without the assignment.
As far as prior perceptions about this group go... I expect that it will be mostly males. They will probably all be in what most people would call "nerdy" majors. I expect a lot of the people in the club will have been doing this since before they came to A&M, and as a result, I assume many will be from upper-middle-class backgrounds. I am not sure how they will treat new members. They may be critical and cold towards newcomers, or they may not be so elitist and instead welcome new blood with open arms. Of these two possibilities, I cannot know which is true until I actually attend a meeting.
Initial Results
The initial interaction with the group was extremely satisfying. Not only did I have tons of fun, but I was able to gather a ton of good information. The first thing I noticed was that the gender gap was not as wide as I originally envisioned: at the beginning of the meeting, there were about 12 people total, with 4 females. The atmosphere of the group seemed laid back, and they were very happy to have new members, which was heartening.

The first thing we did once everyone was ready was warm-up exercises. This activity somewhat solidified my initial expectations about the kind of members present in this group; that is, everyone struggled with the exercises. Basically: this was not a group of hardcore athletes. After the exercises, during the stretches, we all introduced ourselves so the new members (Trevor and me) could get to know everyone without it being awkward. At this point, another assumption was proven true: most of these guys were in either engineering or scientific fields. Next we did footwork drills, and since I was new and did not know how to do them, one of the members was nice enough to take me aside and go over them one on one.

After these were complete, the rest of the members gathered their equipment and began sparring. The same member who went over the footwork drills with me continued her kindness, took me to the armory, and showed me where all of the equipment is and what items I needed. We got back to the practice room, several people helped me suit up, and I was immediately thrust into a sparring lane. The dominant theory of learning in this group seemed to be trial by fire. Armed with a foil and almost no knowledge, I was put to the test against one of the experienced members. At first I was nervous and worried, but these feelings soon faded and I enjoyed myself. Like all good sports, the best way to learn is to do. You don't sit in a classroom or have someone explain how to catch a ball or make a free throw; you just go out there and do it. We spent the rest of the four-hour meeting sparring with different members, getting tips and pointers, and generally having fun.
Overall, I felt really welcomed into the group, and I expect this to be a very good group to complete my ethnography on.
Tuesday, September 13, 2011
Blog #6: TurKit
Reference
Authors:
Greg Little Massachusetts Institute of Technology, Cambridge, MA, USA
Lydia B. Chilton University of Washington, Seattle, WA, USA
Max Goldman Massachusetts Institute of Technology, Cambridge, MA, USA
Robert C. Miller Massachusetts Institute of Technology, Cambridge, MA, USA
Published in:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
This paper is mainly about a concept called "Mechanical Turk" (MTurk), in which computer programs hand off to humans computations that are hard to mechanize but easy for people. Examples include labelling pictures, reviewing things, and rating things; all of these tasks are either impossible or very difficult to program into a computer but very simple for humans to do. The authors list the features their program implements: an API for using MTurk with standard programming techniques, a web GUI for managing MTurk scripts, and a new model of programming called "crash-and-rerun" programming, which accounts for the long delays involved in waiting for human input. A common success story the authors cite for this kind of human computation is the online encyclopaedia Wikipedia, a website built entirely from user input and interaction.
The scripting language the authors implement to aid MTurk programming is called "TurKit Script." It is based on the popular scripting language JavaScript, lightly modified to allow for crash-and-rerun programming and other MTurk features. Although TurKit is single-threaded, it has functions to mimic multi-threaded programs. This is useful because MTurk is inherently parallel, since many people will be entering input at the same time.
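The crash-and-rerun idea can be sketched in a few lines (a Python analogy of the concept, not TurKit's actual JavaScript API; the file name is made up): expensive human-dependent steps are recorded to durable storage, so after a crash the whole script simply re-executes from the top and skips straight past work already done.

```python
import json, os

DB_FILE = "turkit_state.json"  # hypothetical persistence file

def once(key, expensive_call, db_file=DB_FILE):
    """Run expensive_call the first time only; on every later re-run
    of the script, replay the recorded result instead."""
    db = json.load(open(db_file)) if os.path.exists(db_file) else {}
    if key not in db:
        db[key] = expensive_call()          # e.g. post a task and wait for a human
        json.dump(db, open(db_file, "w"))   # persist before moving on
    return db[key]

# After a crash, re-running the script returns this instantly from disk.
label = once("label-image-42", lambda: input("Label image 42: "))
print("Got label:", label)
```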
Discussion
Yet again, I am completely fascinated by the topic of this paper. Correctly identifying images or writing accurate reviews of certain items is extremely difficult in terms of computer programming, so farming out work like this to humans, who can do it easily, is very smart. I already see this kind of technique used in the ever-popular reCAPTCHA, and I can foresee it being used elsewhere as well.
Thursday, September 8, 2011
Blog Paper Reading #4: Gestalt
References
Authors:
Kayur Patel University of Washington, Seattle, WA, USA
Naomi Bancroft University of Washington, Seattle, WA, USA
Steven M. Drucker Microsoft Research, Seattle, WA, USA
James Fogarty University of Washington, Seattle, WA, USA
Andrew J. Ko University of Washington, Seattle, WA, USA
James Landay University of Washington, Seattle, WA, USA
Published in:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary
This paper is about the development environment "Gestalt," a program designed to aid in machine learning: the practice of designing algorithms that capture behavioural patterns by processing data. The authors recognize the difference between the two tasks of implementing and analyzing data pipelines, and explain how Gestalt distinguishes between the two and allows users to switch back and forth between these tasks easily.
The authors explain how many different kinds of problems can be solved with similar general methods using machine learning and Gestalt. Their primary examples are determining whether a movie review is negative or positive by analyzing its vocabulary and grammar, and recognizing hand and pen gestures.
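To make the movie-review example concrete, here is a minimal featurize-train-classify pipeline (my own scikit-learn sketch with made-up data, not Gestalt itself): the same pipeline structure applies whether the input is review text or gesture traces.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny invented training set: review text -> positive (1) / negative (0).
reviews = ["great film, loved it", "terrible plot, awful acting",
           "wonderful and moving", "boring and bad"]
labels = [1, 0, 1, 0]

# Every stage of the pipeline stays visible: featurize, train, classify.
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(reviews)
model = LogisticRegression().fit(features, labels)

test = vectorizer.transform(["loved the acting, great plot"])
print(model.predict(test))  # expect [1]
```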
The authors conclude that the best way to help developers utilize machine learning is to expose the entire data pipeline. They reference other machine learning tools that attempt to simplify or expedite machine learning by hiding some steps of the pipeline from users, but they conclude that this only hinders the developer; thus, they chose to make Gestalt show the entire process.
The rest of the paper goes into detail about Gestalt itself and how it presents its data to users.
Discussion
I thought this paper was actually pretty interesting. At first, the concept of using pure data to drive programming seemed foreign and odd to me, but after reading about the uses I began to see the benefits of such a system. Their overall goal and research method seemed sound to me, and I really didn't question any of it. Machine learning seems like a great idea; I just think it needs to be applied only in certain areas.
Tuesday, September 6, 2011
Blog - Paper Reading #3: Pen + touch = new tools
Reference:
Authors:
Ken Hinckley Microsoft Research, Redmond, WA, USA
Koji Yatani Microsoft Research & University of Toronto, Toronto, ON, Canada
Michel Pahud Microsoft Research, Redmond, WA, USA
Nicole Coddington Microsoft, Redmond, WA, USA
Jenny Rodenhouse Microsoft, Redmond, WA, USA
Andy Wilson Microsoft Research, Redmond, WA, USA
Hrvoje Benko Microsoft Research, Redmond, WA, USA
Bill Buxton Microsoft Research, Redmond, WA, USA
Published in:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary:
This paper considers the applications and benefits of what the authors call "simultaneous pen + touch" technology: an interface that users can manipulate with both a touch pen and their fingers. They say that most other technologies focus on one of these input methods and ignore or disallow the other. The primary medium they use to explore this is the Microsoft Surface because, as mentioned before, most devices don't allow both pen- and touch-based input. The authors state it is not their intent to recreate paper and pen on the Microsoft Surface, but instead to use the natural movements and gestures inherent in paper-and-pen input as a base for creating an easy-to-use interface. The test users of this system found many of the input gestures and methods very logical and natural to use. The only issues they ran into were with the specially designed gesture set, which had to be explained to them.
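The division of labor boils down to something like this (my own toy dispatcher with invented event names, not the Surface SDK): touch manipulates, the pen writes, and the two together unlock new tools.

```python
def handle_event(event_type, target):
    """Route input by device, following the paper's principle:
    the pen writes, touch manipulates, and pen + touch yields new tools."""
    if event_type == "pen":
        return f"ink stroke on {target}"
    if event_type == "touch":
        return f"move/zoom {target}"
    if event_type == "pen+touch":
        # e.g. holding a photo with a finger while the pen cuts it
        return f"special tool on {target}"
    return "ignored"

print(handle_event("pen", "page"))         # ink stroke on page
print(handle_event("pen+touch", "photo"))  # special tool on photo
```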
Discussion:
As with the last article, this technology is fascinating, and I wish the Microsoft Surface were more accessible to the general public. The problem with it, at least to me, is that many of the applications are remakes of cheap real-life games or could be done with paper and pen. Hopefully this research will help make the Surface easier to use and open up the range of applications that could be developed for it.
Blog - Paper Reading #2: Hands-on math
Reference:
Authors:
Robert Zeleznik Brown University, Providence, RI, USA
Andrew Bragdon Brown University, Providence, RI, USA
Ferdi Adeputra Brown University, Providence, RI, USA
Hsu-Sheng Ko Brown University, Providence, RI, USA
Published in:
UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology
Summary:
This paper is written by researchers at Brown University who are attempting to determine whether a Computer Algebra System (CAS) can be integrated into a multi-touch interface using the Microsoft Surface. The authors review similar technology and explain why theirs is new and/or different. Previous works in the same area, according to the authors, do not combine multi-touch technology with a light pen. They claim that using these technologies together gives the user a familiar and natural input method in their program.
The authors spend several pages detailing the various hand gestures and interface options available to users of their program. For instance, page creation is handled with an extremely natural gesture: sliding two fingers over the edge of the screen from the outside to the inside, mimicking the action of pulling a new sheet of paper onto the desktop. According to their results, almost all test users were able to figure this out without outside assistance. Another noteworthy interface option is the ability to "fold" pieces of the on-screen paper: much like in a programming IDE, users can pinch a section of the paper, and that section is hidden until the user calls it back.
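My rough guess at how that page-creation gesture might be detected (the coordinates and thresholds are invented, not the authors' numbers): two contacts that start at the screen edge and drag inward trigger a new page.

```python
SCREEN_WIDTH = 1024  # assumed display width in pixels

def is_page_pull(start_xs, end_xs, n_fingers=2):
    """Detect two fingers dragging inward across the right screen edge,
    mimicking pulling a fresh sheet of paper onto the desk."""
    if len(start_xs) != n_fingers:
        return False
    started_at_edge = all(x >= SCREEN_WIDTH - 10 for x in start_xs)
    moved_inward = all(e < s - 50 for s, e in zip(start_xs, end_xs))
    return started_at_edge and moved_inward

print(is_page_pull([1020, 1022], [800, 810]))  # True: create a new page
```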
The user feedback in the last portion of the paper seemed pretty positive. Most of the interface was discoverable without much help, and the features that were not became clear to the users after an appropriate explanation. The only widespread criticism of the program was its limited high-level functionality: all of the users wanted a deeper program in terms of the level of math it could help with.
Discussion:
The first thing that I cannot shake from my mind is the fact that the authors used the word "cool" multiple times in their article. Although it was quoted from the feedback of the test users, it still cracked me up every time I read it. Aside from that, this technology is very "cool." Reading this paper made me want to buy a Microsoft Surface solely for the ability to use a program like this (until I looked up the price :/). Overall, I was interested in how the authors decided on each gesture and input method, and as an outsider, I could see how each one is natural and a good choice.
Thursday, September 1, 2011
Blog Post #1- Imaginary Interfaces
Reference Information
Title:
Imaginary interfaces: spatial interaction with empty hands and without visual feedback
Authors:
Sean Gustafson Hasso Plattner Institute, Potsdam, Germany
Daniel Bierwirth Hasso Plattner Institute, Potsdam, Germany
Patrick Baudisch Hasso Plattner Institute, Potsdam, Germany
Summary
This paper addresses the topic of literally "imaginary interfaces": a device that captures hand gestures and motions but provides no feedback in the form of screens, projectors, or anything else. This is a unique concept because most other devices either have a projector or use an extremely limited vocabulary of allowed hand gestures to signal the computer. For example, you might have ten pre-programmed commands for signaling certain actions, such as opening a door. In this paper, the authors go further and suggest a device that allows unlimited options and hand gestures. They propose letting users set up a 2D coordinate plane with one hand, then use the other hand to draw on this imaginary screen. They have evidence that the average user's short-term memory is good enough to remember the locations of the things they draw, which makes such a device usable. The authors also go into detail about the advantages of such a system: among them, the fact that the user does not need to carry a device, freeing up space on their person and in their hands, and the recent trend toward hand gestures as a primary method of manipulating interfaces.
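The core geometry can be sketched like this (a 2D simplification of my own, not the authors' code): the non-dominant hand fixes an origin and an axis direction, and the drawing fingertip is expressed in that hand-relative frame, so strokes stay consistent even with no screen at all.

```python
import numpy as np

def to_imaginary_plane(origin, x_axis_point, fingertip):
    """Express a fingertip position in the 2D frame defined by the
    non-dominant hand: origin at the hand, x-axis toward a reference
    point, y-axis perpendicular to it."""
    x_dir = x_axis_point - origin
    x_dir = x_dir / np.linalg.norm(x_dir)
    y_dir = np.array([-x_dir[1], x_dir[0]])  # rotate x_dir by 90 degrees
    rel = fingertip - origin
    return np.array([rel @ x_dir, rel @ y_dir])

origin = np.array([0.0, 0.0])
x_ref = np.array([1.0, 0.0])
print(to_imaginary_plane(origin, x_ref, np.array([0.5, 0.3])))  # [0.5 0.3]
```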
Discussion
I am not really sure what to put in my discussion section this time... The idea seems interesting, but I am very sceptical about the practical uses. Especially with the interference of random movements, can such a system actually be implemented? Even if it can, people LIKE the ability to see what they are manipulating. I don't think we can turn people away from the popular pastime of staring at their tiny cell phone screens at all times in public.
Book Reading #1: Chinese Room
Reference Information
Minds, Brains, and Programs
by John Searle
Department of Philosophy
University of California
Summary
John Searle proposed a thought experiment he called "The Chinese Room." It goes as follows: suppose a programmer created an AI so advanced that it could pass the Turing test, and suppose this program were given the task of receiving input in the form of Chinese words and giving logical output, also in Chinese. Does this AI "understand" Chinese? Searle concludes that it does not, via another proposed situation: if he were put into a locked room with a Chinese dictionary, and a native Chinese speaker slipped a piece of paper with Chinese on it under the door, and he looked the words up and responded, would you conclude that he speaks Chinese? He says no, and therefore the computer does not "understand" Chinese either, because it is doing the same thing he is, just faster.
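The room is essentially a lookup procedure, which a few lines make vivid (phrases and responses invented for illustration):

```python
# A toy "Chinese room": the program maps inputs to outputs by rule,
# with no claim of understanding on either side of the table.
RULE_BOOK = {
    "你好": "你好！",            # "hello" -> "hello!"
    "你好吗": "我很好，谢谢。",  # "how are you" -> "I'm fine, thanks."
}

def room(slip_of_paper):
    return RULE_BOOK.get(slip_of_paper, "请再说一遍。")  # "please repeat"

print(room("你好"))
```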
Discussion
I would argue that Searle's conclusion is false. I would use the same reasoning he did to show that yes, this computer does in fact "understand" Chinese (maybe not very well, but at least to some degree). He says that since he has to look up the words to be able to communicate, he doesn't speak Chinese, and since the computer is doing just that, only faster, it too doesn't "understand" Chinese. But is this not how a human brain works? For the native speaker, he hears the word, then his brain "looks up" what it means and various pieces of information about it so he can understand its meaning. If a computer could do this at the same speed, then what is the difference between a computer doing it and the human brain doing it?