Description
Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return the researcher's h-index.
According to the definition of h-index on Wikipedia: The h-index is defined as the maximum value of h such that the given researcher has published at least h papers that have each been cited at least h times.
Example 1:
Input: citations = [3,0,6,1,5] Output: 3 Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
Example 2:
Input: citations = [1,3,1] Output: 1
Constraints:
n == citations.length1 <= n <= 50000 <= citations[i] <= 1000
Solutions
Solution 1: Sorting
We can sort the array citations in descending order. Then we enumerate the value h from large to small, if there is an h value satisfying citations[h-1] ≥ h, it means that there are at least h papers that have been cited at least h times, just return h directly. If we cannot find such an h value, it means that all the papers have not been cited, return 0.
Time complexity O(n × log n), space complexity O(log n). Here n is the length of the array citations.
class Solution: def hIndex(self, citations: List[int]) -> int: citations.sort(reverse=True) for h in range(len(citations), 0, -1): if citations[h - 1] >= h: return h return 0(code-box)
Solution 2: Counting + Sum
We can use an array cnt of length n+1, where cnt[i] represents the number of papers with the reference count of i. We traverse the array citations and treat the papers with the reference count greater than n as papers with a reference count of n. Then we use the reference count as the index and add 1 to the corresponding element of cnt for each paper. In this way, we have counted the number of papers for each reference count.
Then we enumerate the value h from large to small, and add the element value of cnt with the index of h to the variable s, where s represents the number of papers with a reference count greater than or equal to h. If s ≥ h, it means that at least h papers have been cited at least h times, just return h directly.
Time complexity O(n), space complexity O(n). Here n is the length of the array citations.
class Solution: def hIndex(self, citations: List[int]) -> int: n = len(citations) cnt = [0] * (n + 1) for x in citations: cnt[min(x, n)] += 1 s = 0 for h in range(n, -1, -1): s += cnt[h] if s >= h: return h(code-box)
Solution 3: Binary Search
We notice that if there is a h value that satisfies at least h papers are cited at least h times, then for any h'<h, at least h' papers are cited at least h' times. Therefore, we can use the binary search method to find the largest h such that at least h papers are cited at least h times.
We define the left boundary of binary search l=0 and the right boundary r=n. Each time we take mid = \lfloor l + r + 1⁄2 \rfloor, where \lfloor x \rfloor represents floor x. Then we count the number of elements in array citations that are greater than or equal to mid, and denote it as s. If s ≥ mid, it means that at least mid papers are cited at least mid times. In this case, we change the left boundary l to mid. Otherwise, we change the right boundary r to mid-1. When the left boundary l is equal to the right boundary r, we find the largest h value, which is l or r.
Time complexity O(n × log n), where n is the length of array citations. Space complexity O(1).
class Solution: def hIndex(self, citations: List[int]) -> int: l, r = 0, len(citations) while l < r: mid = (l + r + 1) >> 1 if sum(x >= mid for x in citations) >= mid: l = mid else: r = mid - 1 return l(code-box)
