‘strategy’ Category

  1. The basics of a good comment system

    May 22, 2015 by max

    Comments are everywhere, from Facebook and Reddit to your local newspaper. Yet a lot of sites (cough… newspapers… cough) seem unable to maintain comment sections of an acceptable quality. As someone who has followed the development since Slashdot started taking comments seriously in 1997, I think a short guide may be in order.

    This post will outline the basics of a good comment system. Note that since this is a vast field some finer nuances may be omitted, and my personal opinion and preferences will probably shine through. These are just the basics. Also, I won’t get into technical implementations since that would be too much for one blog post, and would depend on which language/framework is used.

    Also note that this guide applies to sites that have many users and many comments. If you have a small blog where each post gets 5-10 comments you should just use a standard comment system, or maybe Disqus.

    So let’s get started.

    The basics

    A good comment system consists of three parts:

    1. User profiles. This may seem obvious, yet I still see sites that don’t have them. The profile is a user’s identity when he is commenting on a site.
    2. An upvoting/downvoting mechanism. There are many implementations, among them Facebook likes, Reddit’s upvotes and downvotes, and Slashdot’s dropdown choice between interesting/insightful/funny/informative.
    3. A sorting algorithm that will sort the comments based on input from user profiles and the upvoting/downvoting mechanism. This is the vital part that sorts the quality comments from the inevitable trolls, me-too posts and conspiracy nuts.

    Let’s look a bit more closely at each of these three components.

    User profiles

    The primary function of the user profile is, of course, to identify the user. But it has several other uses that are just as essential:

    • Giving the user an identity. The more of an identity a user has on a given website the more he will feel a part of a community, adhere to rules and netiquette and ultimately write better comments.
    • Identifying good and bad citizens. Users that write good comments will often do so consistently, users that write bad comments will often do so consistently too. This can be used in the algorithmic placement of comments.
    • The ability to “get to know” other users. A comment stating that something is completely wrong may either be very insightful, very stupid, or just trolling – it depends on the context and to a large degree the user. Being able to recognise the user, look through his comment history, and maybe see his profile adds a lot of context and value.

    A voting mechanism

    The voting mechanism allows others to vote on a user’s contributions. This is the primary input for the sorting algorithm in assessing how valuable a comment is – a comment with a lot of positive votes is usually more valuable than one with none. Voting also serves two subtle psychological purposes. One is to allow users to easily approve or disapprove of a comment, and the other is to give some (hopefully positive) feedback to the commenter. Both help user retention and the feeling of being part of a community.

    There are a number of different implementations to choose from.

    • Likes. Probably the best known and most versatile voting mechanism, popularised by Facebook. It has the intrinsic advantage that it is cognitively easy to parse – even your mom knows what it means to like something. Pressing the thumbs-up icon is done millions of times every day by non-technical users. It’s a safe bet, and probably the right thing to use if your demographic is old or not net savvy.
    • Upvotes/downvotes. The up and down arrows popularised first by Digg and then by Reddit are for a slightly more tech-savvy crowd. An upvote and a like are of course technically the same, but the psychology behind them is slightly different. A like is exactly what it says, whereas an upvote can mean “I like this”, “this is a worthy and interesting comment”, “this adds to the conversation” or something else. It depends on the site, and it is not trivial to convey to users what an upvote means. Some sites, such as Hacker News, have a stated set of guidelines that users (surprisingly!) adhere to, whereas on a site like Reddit an upvote means different things depending on whether you’re in the “askhistorians” subreddit or the “aww” subreddit. Downvotes are of course the alter ego of upvotes, but you need to think about the psychology behind them before you implement them. Upvotes are a positive acknowledgement; downvotes are a negative acknowledgement, which may deter users from coming back.

    A sorting algorithm

    One of the sad facts about the Internet is that 90% of what people write is crap. 99% if you set the bar high, 80% if you’re an optimist. This applies to comments too, and it means that without some intervention a reader is forced to read through 10 crappy comments before reading a good one. Most people don’t have the time for that.

    That’s why sorting comments is important.

    The ultimate goal is to present the reader with the good comments and allow him to skip over the bad ones. The goal of a sorting algorithm is to find these nuggets, and the job of the UX people is to present the nuggets to readers in the best way possible. Note that these are two different things: the algorithm calculates a score, and the design presents the comments to the user based on it.

    Most algorithmic sorting systems are primarily based on other users’ votes, but presenting only the comments with the most votes creates a problem: how will new comments gain votes? What if comment number 100 is incredibly insightful, but gets no votes because no one reads through 100 comments before they see it? The solution is an algorithm that makes sure that comments with many votes rise to the top, but also that new comments are seen and have a chance to get voted on.

    Probably the simplest version, which works surprisingly well, is the one used by Hacker News. The algorithm is as follows:

    Score = (votes-1) / (time since creation in hours+2)^1.8

    If you’re mathematically inclined you’ll see that votes raise the score while age lowers it: the vote count is divided by a number that grows as the comment gets older. Since comments are listed according to score, this means that a new comment needs only a few votes to sit at or near the top, where other users can see it and vote on it, but it quickly falls down the page if it receives no further upvotes. Thus the playing field is more level, and late comments still have a chance to rise to the top.
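
    As a rough illustration, the formula above can be sketched in a few lines of Python. The function name, the sample comments and the fixed timestamp are made up for the example; only the formula itself comes from the text.

```python
from datetime import datetime, timedelta, timezone

GRAVITY = 1.8  # the exponent from the formula above


def hn_score(votes: int, created_at: datetime, now: datetime) -> float:
    """Score = (votes - 1) / (age in hours + 2) ** 1.8"""
    age_hours = (now - created_at).total_seconds() / 3600
    return (votes - 1) / (age_hours + 2) ** GRAVITY


# Hypothetical sample data: two ten-hour-old comments and one fresh one.
now = datetime(2015, 5, 22, 12, 0, tzinfo=timezone.utc)
comments = [
    {"id": "old-popular", "votes": 40, "created_at": now - timedelta(hours=10)},
    {"id": "new-promising", "votes": 5, "created_at": now - timedelta(hours=1)},
    {"id": "old-ignored", "votes": 2, "created_at": now - timedelta(hours=10)},
]

ranked = sorted(
    comments,
    key=lambda c: hn_score(c["votes"], c["created_at"], now),
    reverse=True,
)
ranked_ids = [c["id"] for c in ranked]
```

    With these numbers the one-hour-old comment with 5 votes ranks above the ten-hour-old comment with 40, which is exactly the "late comments still have a chance" effect.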

    Reddit’s sorting algorithm works on the same principle of presenting users with a list of comments sorted by score, but the score is calculated somewhat differently. It uses the Wilson score interval, a statistical method published by Edwin B. Wilson in 1927(!). The idea is to treat the votes a comment has received so far as a sample, much like an opinion poll, and to estimate from that sample how well the comment would be rated if everyone voted on it. The comment sorting was introduced with input from Randall Munroe of XKCD fame, who has written a very readable blog post about how it works here.
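
    A minimal sketch of the lower bound of the Wilson score interval looks like this. The 95% confidence level (z = 1.96) and the function name are assumptions for the example; the post does not say which constants Reddit actually uses.

```python
import math


def wilson_lower_bound(upvotes: int, downvotes: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the share of upvotes.

    z = 1.96 corresponds to roughly 95% confidence (an assumed value).
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0  # no votes yet, so nothing to estimate from
    p = upvotes / n  # observed share of upvotes
    denominator = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / denominator


one_lucky_vote = wilson_lower_bound(1, 0)      # 100% positive, but only 1 vote
well_tested = wilson_lower_bound(90, 10)       # 90% positive over 100 votes
```

    The point of the lower bound is that a comment with a single upvote does not outrank one with 90 upvotes out of 100: the fewer the votes, the less the score trusts them.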

    Amix.dk has a good run-through of both Reddit’s and Hacker News’ algorithms.

    Facebook’s sorting algorithm is complex, often changing and a well-kept secret – so it’s hard to say something meaningful about how it works. At least something meaningful that doesn’t change in a week.

    The old-timer Slashdot solves the problem somewhat differently. The comments are listed chronologically, but the ones that receive few or no votes are hidden from view, and require an active click to view. Since their voting system is a dropdown of insightful/informative/interesting/funny you can choose to sort by one of these if you just want to see the funny comments. Or the insightful ones. The advantage of this solution is that it keeps the chronological nature of the comment section intact, while still presenting only the best comments to the user.
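
    The idea of keeping chronological order while collapsing low-scored comments could be sketched as below. This is not Slashdot’s actual code; the field names, the threshold and the tag filter are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Comment:
    id: int
    posted_order: int  # position in the chronological sequence
    score: int
    tag: str = ""      # e.g. "funny" or "insightful"; empty if untagged


def display_plan(
    comments: List[Comment], threshold: int = 1, tag: Optional[str] = None
) -> List[Tuple[int, bool]]:
    """Return (comment id, collapsed?) pairs in chronological order.

    A comment is collapsed if its score is below the threshold, or if a
    tag filter is active and the comment carries a different tag.
    """
    plan = []
    for c in sorted(comments, key=lambda c: c.posted_order):
        collapsed = c.score < threshold or (tag is not None and c.tag != tag)
        plan.append((c.id, collapsed))
    return plan


thread = [
    Comment(id=1, posted_order=1, score=5, tag="insightful"),
    Comment(id=2, posted_order=2, score=0),
    Comment(id=3, posted_order=3, score=3, tag="funny"),
]
default_view = display_plan(thread)
funny_only = display_plan(thread, tag="funny")
```

    Nothing is reordered or deleted here; a collapsed comment is still in its place and just needs a click to expand.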

    Note that the above sorting algorithms are just the basics, and that you can, and probably should, add and experiment to get it right. Maybe you should include a user’s average comment score in the algorithm, maybe you should add a negative weight to new users, maybe votes from moderators should count double. The possibilities are endless. This is also why it’s important to keep the sorting algorithm separate in your code base so you can continue to tweak and perfect it.
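
    One way to keep the sorting algorithm separate is to pass the scoring function in as a parameter, so a tweak is just another function. The sketch below assumes the Hacker News formula as the base; the weights (0.05 and 0.1) and the field names are invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Comment:
    votes: int
    age_hours: float
    author_avg_score: float = 0.0  # hypothetical per-user statistic
    author_is_new: bool = False


def base_score(c: Comment) -> float:
    """The Hacker News style formula from earlier in the post."""
    return (c.votes - 1) / (c.age_hours + 2) ** 1.8


def tweaked_score(c: Comment) -> float:
    """Base score plus two experimental adjustments (made-up weights)."""
    score = base_score(c)
    score += 0.05 * c.author_avg_score  # reward consistently good writers
    if c.author_is_new:
        score -= 0.1                    # small penalty for brand-new users
    return score


def rank(comments: List[Comment], score_fn: Callable[[Comment], float]) -> List[Comment]:
    """Sorting knows nothing about how the score is computed."""
    return sorted(comments, key=score_fn, reverse=True)


veteran = Comment(votes=5, age_hours=2.0, author_avg_score=4.0)
newcomer = Comment(votes=5, age_hours=2.0, author_is_new=True)
tweaked_order = rank([newcomer, veteran], tweaked_score)
```

    Swapping `tweaked_score` for `base_score` (or for next week’s experiment) changes the ranking without touching the rest of the code.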

    Moderation

    If you have a reasonable number of comments you need moderation. There will always be trolls, personal attacks, haters and just assholes, and you need to do something about them because they will infest your community and drive the good users away if you don’t. Nobody wants to spend time writing a thoughtful comment that will be lost in a sea of swearwords, illuminati conspiracies and presumptuous premises. This is a cumulative effect: once you start having bad comments (for some definition of bad, which obviously depends on your community) they will attract more. The same goes for good comments. This is why moderation is important.

    Good moderation is a combination of human and machine effort. The most blatant spam can be caught using standard techniques such as Bayesian filtering, but reasoning about the validity of comments above a very low threshold is still beyond algorithms. There are a few different techniques that can be employed:

    Algorithmic sorting

    The sorting algorithm will get you a long way, especially if you have downvotes. Comments that have a sizable number of downvotes can automatically sink to the bottom, where few people read them. Hacker News has a rather clever system where the text color of a comment gets closer and closer to the background color the more downvotes it receives. After enough downvotes it is invisible.
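
    The fading trick is easy to sketch. The version below assumes black text on a white background and full invisibility after 5 downvotes; both are made-up values for the example, not Hacker News’ actual settings.

```python
def faded_text_color(downvotes: int, fade_after: int = 5) -> str:
    """Return a hex grey between black text and a white background.

    0 downvotes gives black; `fade_after` or more gives white, i.e. the
    text becomes indistinguishable from the (assumed white) background.
    """
    fraction = min(max(downvotes, 0), fade_after) / fade_after
    level = round(255 * fraction)
    return f"#{level:02x}{level:02x}{level:02x}"
```

    For example, `faded_text_color(0)` gives `#000000` and `faded_text_color(5)` gives `#ffffff`, with shades of grey in between.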

    An additional measure is a “report spam” button that lets users report spam comments. This is useful, since pressing it is a clear indication that the user thinks a comment is spam. The system should, however, not just delete the comment, since that would be an easy way to cheat the system and remove comments that you disagree with. Instead the report button should be incorporated into the moderation system, such that the action taken is based on a more nuanced set of parameters. These could include the reporting user’s previous posts, average score, or time since account creation; they could include the same parameters for the writer of the comment; and the report could send a message to the moderators. Bringing us to…
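
    As a sketch of what “a more nuanced set of parameters” might look like, the example below weighs each report by the reporter’s track record before deciding what to do. Every threshold, weight and action name here is hypothetical; a real system would tune these against its own data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reporter:
    avg_comment_score: float
    account_age_days: int


def report_weight(reporter: Reporter) -> float:
    """How much one report counts (all numbers are made-up examples)."""
    weight = 1.0
    if reporter.account_age_days < 7:
        weight *= 0.25  # brand-new accounts count for less
    if reporter.avg_comment_score < 0:
        weight *= 0.5   # so do users with a poor track record
    return weight


def action_for(reporters: List[Reporter], hide_at: float = 3.0) -> str:
    """Decide what happens to a reported comment. Never deletes outright."""
    total = sum(report_weight(r) for r in reporters)
    if total >= hide_at:
        return "hide_and_notify_moderators"
    if total > 0:
        return "notify_moderators"
    return "none"


trusted = [Reporter(avg_comment_score=3.0, account_age_days=400)] * 3
throwaways = [Reporter(avg_comment_score=-1.0, account_age_days=1)] * 3
```

    Three reports from established users hide the comment pending review, while three reports from day-old throwaway accounts only ping the moderators, which makes it harder to use the button to silence disagreement.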

    Moderators

    Moderators are the humans that make sure everything works as it is supposed to. These can either be paid moderators, which quickly gets expensive, or power users that volunteer. Typically a hierarchy is employed, with paid staff at the top and a number of volunteers below them. The job of the paid staff is to find and keep good moderators, tweak algorithms and do normal housekeeping. The job of the volunteers is to moderate comments. One important reason for having volunteer moderators is to have a better response time. If moderation is only done by normal employees, response times are typically slow, both because people have other things to do, and because there typically will be no moderation after working hours. A well-kept volunteer-based system on the other hand will have almost instant moderation.

    Banning

    Some users just won’t learn. Maybe they are trolls, maybe they have a personal agenda, or maybe they just have nothing better to do. To have a well-functioning community you need to get rid of them, since they can quickly infest and degrade your comment section. Banning can either be automatic or done by moderators, and a ban can either be on the user profile (with the disadvantage that he can just create another) or on the IP address (with the disadvantage that others from that IP can’t join the discussion, and the problems with dynamic IPs). There is no proven way to completely ban a user and make sure he doesn’t come back, short of requiring personal ID, which is probably taking it a bit too far. For most sites it’s a whack-a-mole game, but the more effective you are at weeding out, the smaller the problem becomes as bad users find out that their comments won’t be read anyway.

    A clever way of keeping bad users in a trap is hell-banning – they will see their comments on the site, but they will be invisible to everyone else. Often they don’t realise this, and wonder why their snarky comment doesn’t trigger a response, not realising that they are the only ones to see it. Eventually they will get tired and go somewhere else. A particularly insidious version of hell-banning is to let hell-banned users see comments from other hell-banned users.
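
    Hell-banning boils down to a visibility filter applied per viewer. The sketch below shows one way to express it, including the variant where hell-banned users see each other; the names and data structures are assumptions for the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Comment:
    author: str
    text: str


def visible_to(
    viewer: str,
    comments: List[Comment],
    hellbanned: Set[str],
    banned_see_each_other: bool = False,
) -> List[Comment]:
    """Return the comments a given viewer is shown."""
    viewer_is_banned = viewer in hellbanned
    shown = []
    for c in comments:
        if c.author not in hellbanned:
            shown.append(c)   # normal comments are visible to everyone
        elif c.author == viewer:
            shown.append(c)   # a banned user still sees his own comments
        elif viewer_is_banned and banned_see_each_other:
            shown.append(c)   # optional: banned users see each other
    return shown


thread = [
    Comment("alice", "Interesting point."),
    Comment("troll1", "You are all sheep."),
    Comment("troll2", "First!!!"),
]
banned = {"troll1", "troll2"}
alice_view = [c.author for c in visible_to("alice", thread, banned)]
troll_view = [c.author for c in visible_to("troll1", thread, banned)]
troll_pen = [c.author for c in visible_to("troll1", thread, banned, banned_see_each_other=True)]
```

    From the troll’s side the page looks normal, which is the whole point: nothing tells him that nobody else can see what he wrote.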

    Transparency

    Experience suggests that at least some transparency is important for a good community. If you just delete comments, users are prone to start speculating and eventually get angry. Conspiracy theories about the political bias of moderators, personal agendas and the like are bound to pop up. So are comments about it, and they typically don’t add to the conversation. A good start is a set of guidelines that state what is and isn’t allowed. Being able to contact moderators is another good measure. A switch that allows users to see deleted comments is another. Sending an automated message with the guidelines is another. Just deleting comments with no reason is a bad idea unless it’s obviously spam.

    Some sites such as Hacker News choose to keep the identity of moderators secret (or at least not publicly available), whereas sites such as Reddit have visible moderators for each subreddit that are there for all to see. Slashdot employs a unique system where some users are granted moderation abilities for short time periods based on their past actions. This approach crowdsources the moderation to all users, and may be more fair, since it removes the potential for a single moderator with a political agenda, a personal vendetta or other undesirable behavior.

    Design and usability

    Design and usability are important factors. You should strive for a system that makes it easy for new users to join the conversation and, if you have the resources, gives advanced possibilities to advanced users.

    The sign-up process

    The sign-up process should be easy and hassle-free; username and password and maybe e-mail should really be enough. Full name, number of pets, where you are from and sexual orientation are just filler that will drive new users away. I have seen some sites try to use the sign-up process to minimise spam comments by requiring phone numbers or real IDs. I have seen no data to suggest that this works. If your strategy for minimising spam and bad comments is to make it harder to sign up you’re doing it wrong. Facebook is an exception here – the only reason it works is because they have massive network advantages.

    Writing and reading comments

    Writing a comment should be easy. Again, making it hard or limiting users’ options doesn’t help much against bad comments, but it definitely hinders good comments. This is also the wrong place for moderation. Most well-functioning comment sections seem to have some kind of markdown, so that users can style their comments. This is a big win for longer comments, which otherwise would just be a wall of text. Typically styling is limited to simple things such as bold, indentation, headings, links and unordered lists. Not much, but enough to make a long comment readable. It’s not an absolute must, but with all the free markdown editors available it’s an easy implementation. I have seen some sites limit comments to 500 or 1000 characters, and I’m pretty sure this is a terrible idea. You end up with complaints, comments in 2 or 3 parts, and no one writing thoughtful comments, without any apparent upside.

    Anonymous posting may have its merits if the conversation is sensitive and involves whistleblowing, sexual orientation, personal problems, or a number of other subjects. Typically users will create a throwaway account that will only be used for one comment thread. In my experience some of these anonymous postings are incredibly interesting because they touch on subjects that are normally taboo in one way or another. Slashdot has an interesting twist on anonymous posting: when you are logged in you can choose to post anonymously, and your comment will appear with the username “anonymous coward” and get an automatic penalty in the voting system to keep anonymous spam and personal attacks near the bottom.

    The discussion between linear (one long list) and threaded (hierarchical, like folder views) comments has been ongoing since newsgroups were the hot thing. The advantage of linear comments is that they are easier to understand for non-technical users, but they are harder to parse for more savvy users. Conversations in particular are a problem for linear comments: replying to another user is a mess, and following a conversation is even more of a mess. Threaded conversations seem to be prevailing as more and more users get used to them. It’s also hard not to notice that almost all well-functioning comment sections have some kind of threaded comment system. On a side note, I’ve more than once heard the argument that threaded comments with unlimited depth are almost impossible to implement. I suggest these people learn about recursion.
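
    To back up the recursion remark, here is a minimal sketch that turns a flat list of comments, each pointing at its parent, into an indented thread of any depth. The tuple layout and function names are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Each comment is (id, parent_id, text); parent_id is None for top-level comments.
FlatComment = Tuple[int, Optional[int], str]


def group_by_parent(comments: List[FlatComment]) -> Dict[Optional[int], List[Tuple[int, str]]]:
    """Index the flat list so the replies to any comment can be looked up."""
    children = defaultdict(list)
    for comment_id, parent_id, text in comments:
        children[parent_id].append((comment_id, text))
    return children


def render(children, parent_id: Optional[int] = None, depth: int = 0) -> List[str]:
    """Recursively render a comment, then its replies one level deeper."""
    lines = []
    for comment_id, text in children.get(parent_id, []):
        lines.append("  " * depth + text)
        lines.extend(render(children, comment_id, depth + 1))
    return lines


flat = [
    (1, None, "Top-level comment"),
    (2, 1, "Reply"),
    (3, 2, "Reply to the reply"),
    (4, None, "Second top-level comment"),
]
thread_lines = render(group_by_parent(flat))
```

    The depth is limited only by the language’s recursion limit (around 1000 levels by default in Python), which no real conversation is likely to reach.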

    Interaction and psychology

    Why do people spend their time writing comments? To paint with a really broad brush it’s either because they are bored, have an agenda, are angry or have something interesting to say. The ones that have something interesting to say are usually the ones that are most busy, and have the least patience for obstacles. For this reason it should be simple and quick to make a comment. The downside is that it will also be simple and quick for users that don’t have anything to add, but that’s what comment sorting and moderation are for. Making it hard to join the conversation is throwing the baby out with the bathwater.

    Actually there’s another reason people spend their time writing comments, and it’s probably the most important one: to feel part of a community, and to get a feeling of acceptance or empowerment from that community. This is why feedback is important. Facebook is the master of this. We all know and love the little red globe on the top right of the page that indicates that someone has liked or responded to something we wrote. As any psychologist can tell you this brings you closer to the community, and gives you a more favorable view of the site. It also promotes discussion since a user is notified when someone responds to his comment. You absolutely need to have functionality that easily lets users see responses to what they have written – it’s one of the major psychological drivers for spending time writing out a long thoughtful comment.

    Karma is the word normally used for votes/points/likes. The more votes you have the more karma you have. It’s a disputed term that many people have a love/hate relationship with, but it works. Most power users on a given site with karma will follow it, and most people won’t acknowledge that they do so. It’s a measure of how good a member of the community you are, or as a psychologist might say, it is an extension of your ego. Even though it’s just a number it has a profound psychological effect, and spurs users to write better comments to gain karma. Some sites even have top lists of users with the most karma.

    Closing thoughts

    What was originally intended to be a short guide to comments for noobs ended up being much longer than I thought, and I’ve only covered the basics. This probably goes to show that comments are somewhat more complicated than they first appear, and that a good implementation is not trivial.

    Best of luck to anyone faced with the job of implementing a good comment system.

    If you think the task is monumental and don’t know where to start, you should send me an e-mail – if you have an interesting project I might be interested.


  2. Why newspapers are dying

    November 30, 2013 by max

    20 years ago newspapers were thriving businesses. They set the political agenda, shaped public opinion and uncovered large scandals. They also made lots of money. Most publishing houses had large headquarters placed in the most expensive parts of town as monuments to their fortune.

    From a business point of view they were in an enviable position. They were largely the gatekeepers between information and the public. If you wanted information you had to buy a newspaper. Gatekeepers typically make lots of money because they operate in a monopoly-like environment. If you want something you have to get it from the gatekeeper. Record labels used to be gatekeepers for music – if you wanted to listen to music you had to buy one of their CDs. Hollywood studios used to be gatekeepers for movies – if you wanted to watch a movie they decided when it would be released in your country, when it would be on television and when you could rent it at Blockbuster. They all made lots of money.

    Newspapers used to be gatekeepers for news. In a world without Internet this gave them a distinct business advantage. They had access to millions of readers. In the newspaper industry it’s no secret that actually producing news is a loss leader; the real business is generated through access to a large, loyal audience that trusts the brand of the newspaper. A newspaper with only news can’t generate enough income to be sustainable.

    But that didn’t matter, because seen from a business perspective news was the vehicle that allowed newspapers to bring their other business models to market.

    If you wanted to sell a house you paid the newspaper to be included in the homes section. If you wanted to fill a position in your company you paid the newspaper for an ad in the jobs section. On top of this came the normal ads. They made lots of money. The Sunday papers were the size of phonebooks, filled with expensive ads, job listings and pictures of real estate for sale. They also wielded some serious political power since they controlled the flow of information to the public.

    Washington Post headquarters – now it’s for sale.

    Then the Internet came along and ruined it all.

    With the rise of the Internet the newspapers’ gatekeeper role diminished. Slowly web-based services crept in and ate the newspapers’ lunch. Monster, LinkedIn and a load of other sites slowly ate the revenue generated from the job boards. Yahoo Real Estate, Zillow and a large number of smaller sites stole the income from the real estate listings. Craigslist stole the classifieds. On and on it went. A million smaller specialised companies eating away at the newspapers’ economic foundation.

    The Sunday paper was slowly reduced from a cash cow the size of a phonebook to a trickle of pennies from a few loyal customers who still thought that a job posting in the New York Times was a pretty good deal. This started happening 15 years ago, without the newspapers taking notice. Or at least it appears that way since none of them put serious effort into developing competing services.

    Only within the last few years have newspapers started to take the Internet threat seriously. Years after it has taken away their business model. A lot of them seem to think that they just need to convince readers to pay for online news, and then everything will be fine – like in the old days. But it won’t. For several reasons:

    • News was never the sole revenue driver for newspapers. Ads, classifieds, job postings and real estate postings made up a substantial amount of the revenue. Those are mostly gone.
    • Ads are cheaper on the Internet than they were in print. It’s a simple question of supply and demand. 20 years ago there weren’t that many possibilities if you wanted to reach consumers. Newspapers were one of them. This drove prices up. On the Internet the supply is nearly endless, which drives prices down. Note that this is true both for online ads and paper ads. (Online ads are a substitute for print ads, and since they’re cheaper a lot of the money previously spent on print ads is now diverted to online ads, thus driving down the demand for print.)
    • News is abundant and free. You can always find your news for free on the Internet. This means that the incentive to pay for your news is dwindling. Very few online publications can generate substantial revenue from online content, mostly in the financial press.
    • Competitors have taken over the lucrative income models such as job postings with dedicated sites that do one thing and do it well.

    So are newspapers doomed?

    Not necessarily, but they have a rough road ahead. And they need to start moving now if they want to be in business 10 years from now.

    There are only 2½ definitive truths in the future of newspaper economics:

    1) It’s a structural change, and the old days won’t come back. Like so many other industries the Internet has disrupted the whole business model and stripped the industry of its gatekeeper role, and thus its income model. This won’t change.

    2) Nobody has the obvious true answer.

    2½) In 10 years there won’t be any printed newspapers. (This isn’t entirely certain, but very likely – technology is fast paced; remember that 7 years ago Apple hadn’t even introduced the iPhone.)

    So what should a newspaper do to survive? First, look at what unique advantages it has.

    • Brand value. Most newspapers have a brand that many online businesses would kill for. Who do you trust the most? New York Times or Instagram? Brand value is a key metric for driving sales.
    • An audience. A newspaper typically has a large audience that it connects to on a daily basis.
    • Journalists! The people whose job it is to create interesting and engaging stories that depict the world in which we live. A lot of them do it very well.

    So newspapers definitely have some value; what they don’t have is a way of capitalising on it. In a way newspapers are like startups. In the words of Silicon Valley legend Steve Blank, “a startup is an organization formed to search for a repeatable and scalable business model.” The main difference is that newspapers are way ahead of startups in that they already have brand, audience, money and a dedicated staff. All they need is the business model.

    So how do startups find a repeatable and scalable business model? They try a lot of different things and see what sticks. It’s the exception rather than the rule that what a successful startup ends up making money on is what was envisioned in the original business plan. Paypal started as a digital wallet for PDAs, Hotmail started as an online database business. Google didn’t have any idea how they would eventually make money when they started.

    This is the strategy that many successful startups use:

    • Get an idea and create the simplest implementation that could possibly work. Get it out to customers as soon as possible. In startup speak this is called the minimum viable product. The point is not to have a perfect product but to find out whether it’s something people will pay for. If it shows promise you can always improve it.
    • Continuous deployment: Put your minimum viable product out there and test it. Tweak it, make it better, change it a bit and see what happens. Do this continuously until it works. Successful startups often deploy new tweaks multiple times a week.
    • Actionable metrics: If you don’t have metrics you don’t know whether a deployment is a success or not. Metrics can be users, acquisitions, readers or money in the bank.
    • Pivot. If your minimum viable product doesn’t work, even after tweaking and testing, drop it and think of something else. Rinse and repeat until you have a business model that works.

    Newspapers should copy this model. If 2 guys in a garage can make it work so can a news organisation that already has a brand, an audience, journalists and a solid infrastructure.


  3. Where’s your apology Google?

    January 17, 2012 by max

    Four days ago this post appeared on Kenyan company Mocality’s website accusing Google of not only scraping Mocality’s database, which is basically the most valuable part of their business, but also calling up the numbers in the scraped database to upsell them a Google site. Even worse, the Google employees claimed that Mocality was under or working with Google. This is certainly unethical, and may very well be illegal – Mocality is getting ready to sue. Mocality uncovered that this seemed to be an international operation involving Google headquarters and call centers in India.

    The plot was unveiled in a rather clever way by Mocality, and the technical breakdown of how they caught Google with their pants down lent the blogpost a lot of credibility.

    After a few hours it was confirmed that Google was involved with this statement from Nelson Mattos of Google:  “We were mortified to learn that a team of people working on a Google project improperly used Mocality’s data and misrepresented our relationship with Mocality to encourage customers to create new websites. We’ve already unreservedly apologised to Mocality. We’re still investigating exactly how this happened, and as soon as we have all the facts, we’ll be taking the appropriate action with the people involved.”

    The story quickly spread, and both the original blog post and the reply from Mattos were top news on sites such as Hacker News, Reddit and Boing Boing. It even made it as far as The Economist. This is a big deal. Especially because this behaviour stands in stark contrast to Google’s ethos of “Don’t be evil”.

    Yet now, four days later, that’s all we know. We don’t even know whether the above is an official statement from Google since Nelson Mattos presumably posted this from his personal Google+ account.

    This tells me a few things:

    1. Google is not entirely in control of all of their operations. If this scandal was contained to Kenya maybe it could be written off as a few fraudulent employees or a local manager that went a bit too far to get his bonus, but since it appears that it extends to both India and Google’s headquarters in Mountain View there’s someone at a fairly high level that doesn’t have full control of his domain.
    2. Google doesn’t handle PR well. Writing blog posts about new features, April Fools’ jokes and descriptions of how great the food at Google is, is the easy part. It’s when you have a scandal on your hands that your PR needs to shine. Where’s the damage control? Where’s the communication? We don’t even know if Mattos’ statement is official. It’s been four days!
    3. Google is becoming a big company, just like so many other big companies. They don’t know what is going on in all divisions, they’re spreading their portfolio thin (A Kenyan online directory for instance…), and they’re losing their original values because of it.
    4. “Don’t be evil” is not a mantra for Google anymore, it’s become a stale mission statement. Just like the stale mission statements all other big companies have.

    When are we going to get an official apology and explanation? Are we even going to get one?